ALTERED ANTIBODIES HAVING
IMPROVED ANTIGEN-BINDING AFFINITY 



Related Information 
This application claims priority to U.S. provisional patent application number
60/490,087, filed on July 26, 2003, the entire contents of which are hereby incorporated 
by reference. 

The contents of any patents, patent applications, and references cited throughout 
this specification are hereby incorporated by reference in their entireties. 

Background of the Invention

Antibodies are exquisite, naturally occurring biological agents that play a critical 
role in defending the body from pathogens. Antibodies, which are also commonly 
referred to as immunoglobulins, contain four polypeptides: two longer polypeptides 
("heavy chains") that are identical to one another and two shorter polypeptides ("light 
chains") that are identical to one another. The heavy chains are paired with the light 
chains by disulfide bonds, and the two heavy chains are similarly bound to one another to 
create a tetrameric structure. Moreover, the heavy and light chains each contain a 
variable domain and one or more constant regions: the heavy chain includes one variable
domain (VH) followed by three constant regions (C1H, C2H, and C3H), and the light chain
includes one variable domain (VL) followed by a single constant region (CL).

The variable domains of each pair of light and heavy chains form the site that 
comes into contact with an antigen. Both VH and VL have the same general structure,
with four framework regions (FRs), whose sequences are relatively conserved, connected
by three hypervariable or complementarity determining regions (CDRs) (see Kabat et al.,
In "Sequences of Proteins of Immunological Interest," U.S. Department of Health and
Human Services, 1983; see also Chothia et al., J. Mol. Biol. 196:901-917, 1987). The
four framework regions largely adopt a β-sheet conformation and the CDRs form loops
connecting, and in some cases forming part of, the β-sheet structure. The CDRs of VH
and VL are held in close proximity by the FRs, and amino acid residues within the CDRs
bind the antigen. More detailed accounts of the structure of variable domains can be
found in Poljak et al. (Proc. Natl. Acad. Sci. USA 70:3305-3310, 1973), Segal et al. (Proc.
Natl. Acad. Sci. USA 71:4298-4302, 1974), and Marquart et al. (J. Mol. Biol. 141:369-391,
1980).

Researchers have modified antibodies in various ways in order to study their 
function or to improve their utility as therapeutic agents. In some of the earliest 
modifications, researchers used double-stranded DNA sequences to express the VH or VL
domains, but none of the sequence of the constant region (see, e.g., EP-A-0 088 994;
Schering Corporation). Other fragments and chimeric antibodies have also been made.
One particular type of chimera, commonly referred to as a CDR-grafted antibody,
includes sequences from two antibodies that differ in species (e.g., murine CDRs have
been used in place of the naturally occurring CDRs in otherwise human antibodies; see,
e.g., U.S. Patent No. 5,225,539). Researchers hoped that such antibodies would be no
more foreign to the human body than a genuine human antibody, but the utility of such
antibodies has been restricted, at least in some cases, by a reduction in the antibody's
affinity for the antigen. In an attempt to improve affinity, some of the amino acids in the
FRs of CDR-grafted antibodies have been changed from those of the acceptor molecule
(e.g., a human antibody) to those of the antibody that donated the CDRs (e.g., those of a
murine antibody; see, e.g., U.S. Patent No. 5,585,089; U.S. Patent No. 5,693,761; U.S.
Patent No. 5,693,762; and U.S. Patent No. 6,180,370).

Accordingly, there remains a need for antibodies that do not provoke a strong 
immune response but yet bind strongly to their antigens and methods for identifying such 
antibodies. 

Summary of the Invention 
The present invention is based, in part, on the discovery that the affinity of an
antibody (or an antigen-binding fragment thereof) can be improved by modifying amino
acid residues within the antibody. The modification is based, wholly or partially, on a
computational analysis of electrostatic forces between the antibody and an antigen to
which it binds. The computational analysis, in turn, is based on a prediction of charge
distribution within the antibody that generates the electrostatic forces that influence
binding between the antibody and its antigen in a solvent (e.g., an aqueous solvent such as
water, phosphate-buffered saline (PBS), plasma, or blood). The computational methods
define the electrostatic complement (the optimal tradeoff between unfavorable
desolvation energy and favorable interactions in an antigen-antibody complex) for a given
target site and geometry.

In particular, the invention provides criteria or rules by which one can calculate
the optimal charge distribution and associated change in binding free energy between an
antibody and an antigen, when bound in a solvent, and then identify discrete residue
positions for modification. Moreover, the invention provides rules which guide the
selection of an appropriate modification at the identified residue position, e.g., side chain
chemistry, by building a subset of modifications in silico, recalculating the binding free
energy for each, and then selecting a preferred modification.

Thus, the invention has several advantages in that it, unlike other methods, is not
restricted to mere global or pairwise alignment of charges with the presumptive
conclusion that only opposite net charges between an antibody and antigen are favorable.
Rather, the invention provides a more sophisticated analysis (as is appropriate given that a
typical antibody comprises up to four polypeptide chains with inter- and intra-chain
disulfide linkages and six CDR binding surfaces as well as inter-chain interfaces) for
revealing the exact residue positions and side chain chemistries to be used to modify the
binding affinity of an antibody/antigen complex.

Moreover, the invention also fully accounts for the binding interactions of an
antibody when bound to an antigen within a solvent.

And importantly, the invention provides for antibody modifications that alter 
antigen-binding which other methods would either fail to identify or dismiss as unsuitable 
to try. 

In one aspect, the invention features a method of modulating the antigen-binding
affinity of an antibody that includes the steps of providing data corresponding to the
structure (e.g., a three-dimensional structure) of a complex between an antibody and an
antigen to which the antibody binds; determining, using the data, a representation of a
charge distribution (e.g., a set of multipoles or point charges) within the antibody (e.g.,
within one or more of the CDRs) that would reduce (i.e., optimize or make more
negative) the electrostatic contribution to binding free energy between the antibody and
the antigen; and modifying one or more amino acid residues within the antibody (e.g.,
within one or more of the CDRs) to create a modified antibody corresponding to (or with
a better correspondence to) the charge distribution (i.e., the optimal charge distribution
determined). The result is a charge distribution that can be used to modulate (e.g.,
improve, alter, etc.) the interaction between an antibody and its antigen. For example, if
the side chain of an amino acid residue in an optimized antibody has a net total
charge of -1, one can replace the corresponding amino acid residue in the original
antibody, sometimes referred to as the first antibody or parent antibody, with an amino
acid residue that has a negatively charged side chain to create a modified antibody, which
is a variant of the parent antibody and is sometimes referred to herein as a second antibody
(or even a third or fourth antibody if referring to the modification of an antibody that has
been previously modified and is therefore an iterative variation of the preceding
antibody).
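
By way of a non-limiting illustration of the example above, the following Python sketch
maps a target side-chain net charge at one residue position onto candidate replacement
residues. The function name `candidates_for_charge`, the lookup table, and the nominal
charges assigned near neutral pH are assumptions made for illustration only; they are not
taken from the specification, and actual protonation states depend on the local
environment.

```python
# Nominal side-chain charges near neutral pH. This table is an illustrative
# assumption, not part of the disclosed method.
NOMINAL_SIDE_CHAIN_CHARGE = {
    "Asp": -1, "Glu": -1,
    "Arg": +1, "Lys": +1,
    "Ala": 0, "Asn": 0, "Gln": 0, "Gly": 0, "Ser": 0, "Thr": 0, "Tyr": 0,
}


def candidates_for_charge(target_charge, current_residue):
    """Return residues whose nominal charge equals the target charge,
    excluding the residue already present at the position."""
    return sorted(
        residue
        for residue, charge in NOMINAL_SIDE_CHAIN_CHARGE.items()
        if charge == target_charge and residue != current_residue
    )


# Hypothetical position: the parent antibody has Ser, and the optimized
# charge distribution calls for a net side-chain charge of -1.
print(candidates_for_charge(-1, "Ser"))
```

In this sketch a position whose optimized charge is -1 yields Asp and Glu as candidates;
choosing between them would still require recalculating the binding free energy for each.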

In a related aspect, the invention provides a method of modulating the antigen-binding
affinity of an antibody by determining a spatial representation of an optimal
charge distribution of the amino acids of the antibody and associated change in binding
free energy of the antibody when bound to an antigen in a solvent; identifying at least one
candidate amino acid residue position of the antibody to be modified to alter the binding
free energy of the antibody when bound to the antigen; and selecting an elected amino
acid residue for substitution for said amino acid position, such that upon substitution, the
antigen-binding affinity of the antibody is modulated.

As described further below, once a charge distribution is determined, one or more
of the amino acid residues in the antibody (e.g., one or more of the residues in the
CDR(s), e.g., 2-10 residues or more, e.g., most if not all of the CDR residues and,
optionally, only in the CDR(s)) can be modified to match, or better match, that charge
distribution. For example, an amino acid residue can be replaced with another naturally
occurring amino acid residue or a non-naturally occurring residue. The substitution may
or may not constitute a conservative amino acid substitution. In some instances, it may
be desired to alter the charge distribution by deleting or inserting one or more amino acid
residues.

In some instances, for example, where the data of the structure of a complex
between the antibody and the antigen is available prior to provision of the antibody, one
need only know the sequence of the parent antibody (or the sequence of one or more of
the CDRs of that antibody). The method can be carried out so long as one has, or can
obtain, information regarding the charge distribution within an antibody-antigen complex
containing a parent antibody; that information is then used to modify a modified antibody
in a way that improves the modified antibody's affinity for its antigen. Alternatively, the
methods of the invention can be used to alter (e.g., optimize) the affinity of a fully human
antibody or antigen-binding fragments containing human FRs and human CDRs, for
example, to affinity mature the antibody for improved antigen-binding affinity. A fully
human antibody can be one obtained from human plasma (even though this is an
uncommon practice) or generated in vivo (e.g., an antibody generated in a transgenic
mouse containing human immunoglobulin genes; see U.S. Patent No. 6,150,584).

In the methods of the invention, the parent and modified antibodies can be of the
same or of different species (e.g., the parent antibody can be a non-human antibody (e.g.,
a murine antibody), and the modified antibody can be a human antibody). The antibodies
can also be of the same, or of different, classes or subclasses. Regardless of their origin
or class, portions of the sequences of the two antibodies can be identical to one another.
For example, the FRs of the parent antibody can be identical to the FRs of the modified
antibody. This would occur, for example, where the parent antibody is a human antibody
and the modified antibody varies from the parent antibody only in that the modified
antibody contains one or more non-human CDRs (i.e., in the modified antibody, one or
more of the original, human CDRs have been replaced with a non-human (e.g., murine)
CDR).

The methods of the invention can be carried out with antibodies that have the
structure of a naturally occurring antibody. For example, the methods of the invention
can be carried out with antibodies that have the structure of an IgG molecule (two full-length
heavy chains and two full-length light chains). Thus, in some embodiments, the
parent and/or modified antibody can include an Fc region of an antibody (e.g., the Fc
region of a human antibody). The methods of the invention can be carried out, however,
with less than complete antibodies; they can be carried out with any antigen-binding
fragment of an antibody including those described further below (Fab fragments, F(ab')2
fragments, or single-chain antibodies (scFv)). The "fragments" can constitute minor
variations of naturally occurring antibodies. For example, an antibody fragment can
include all but a few of the amino acid residues of a "complete" antibody (e.g., the FR of
VH or VL can be truncated).

Regardless of whether the method is carried out with a complete antibody or a
fragment thereof, where all or part of the FR is present, the sequence of that FR can be
that of a wild-type antibody. Alternatively, the FR can contain a mutation. For example,
the methods of the invention can be carried out with a parent antibody that includes a
framework region (e.g., a human FR) that contains one or more amino acid residues that
differ from the corresponding residue(s) in the wild-type FR. The mutation can be one
that changes an amino acid residue to the corresponding residue in an antibody of another
species. Thus, an otherwise human FR can contain a murine residue (such mutations are
referred to in the art as "back mutations"). For example, framework regions of a human
antibody can be "back-mutated" to the amino acid residue at the same position in a non-human
antibody. Such a back-mutated antibody can be used in the present methods as the
"parent" antibody, in which case the "modified" antibody can include completely human
FRs. Mutations in the FRs can occur within any of FR1, FR2, FR3, and/or FR4 in either
VH or VL (or in VH and VL). Up to about 10 residues or more can be mutated (e.g., 1, 2,
3, 4, 5, 6, 7, 8, 9, or 10 or more residues in FR1, FR2, FR3, and/or FR4 can be changed
from the naturally occurring residue (e.g., the human residue) to another residue (e.g., a
donor residue, for example, a murine residue, at the corresponding position)). The residues
that immediately flank the CDRs are among those that can be mutated.

In one embodiment, the methods of the invention are carried out with a parent
antibody that is completely non-human (e.g., a murine antibody) and a modified antibody
that includes a human Fc region and completely human FRs.

In certain embodiments, the relative affinities of the parent and modified
antibodies (e.g., the parent, modified or altered antibody of the present invention) can be
such that the affinity of the modified antibody to a given antigen is at least as high as the
affinity of the parent antibody to that antigen. For example, the affinity of the modified
antibody to the antigen can be at least (or about) 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2,
3, 5, 8, 10, 50, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, or 10^8 times greater than the affinity of the
parent antibody to the antigen (or any range or value in between).

The method may also be used to lower the affinity of the antibody, for example,
where it is desirable to have a lower affinity for better pharmacokinetics, antigen-binding
specificity, reduced cross-talk between related antigen epitopes, and the like. For
example, the affinity of the modified antibody to the antigen can be at least (or about) 1.1,
1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3, 5, 8, 10, 50, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, or 10^8
times less than the affinity of the parent antibody to the antigen (or any range or value in
between).




WO 2005/011376 



PCT/US2004/024200 



The methods of the invention can be iterative. An antibody generated, as
described above, can be re-modeled (for example, in silico or empirically, e.g., using
experimental data) and further altered to further improve antigen binding. Thus, the steps
described above can be followed by additional steps, including: obtaining data
corresponding to the structure of a complex between the modified antibody and the
antigen; determining, using the data (which can be referred to as "additional data" to
distinguish it from the data obtained and used in the parent "round"), a representation of
an additional charge distribution of the CDRs of the modified antibody which minimizes
electrostatic contribution to binding free energy between the modified antibody and the
antigen; and expressing a third or further modified antibody that binds to the antigen, the
third antibody having a matured CDR differing from a CDR of the modified antibody by
at least one amino acid, the matured CDR corresponding to the additional charge
distribution. Yet additional rounds of maturation can be carried out. In the method just
described, the resulting antibody would be complexed with (i.e., allowed to bind to)
antigen and used to obtain a charge distribution that minimizes the electrostatic
contribution. A fourth or further modified antibody would then be produced that would
contain modifications, dictated by the charge distribution, that improve antigen binding.
And so forth.

As noted above, the modified antibody (or subsequent antibodies serving in the
place of the modified antibody) can contain a CDR that has been modified so that the
electrostatic forces in the antibody-antigen complex are improved (or optimized).
Presently, the software used to examine electrostatic forces models an optimal charge
distribution and the user then determines what amino acid substitution(s) or alteration(s)
would improve that distribution. Accordingly, such steps (e.g., examining the modeled,
optimal charge distribution and determining a sequence modification to improve antigen
binding) are, or can be, part of the methods now claimed. However, as it would not be
difficult to modify the software so that the program includes the selection of amino acid
substitutions (or alterations), in the future, one may need only examine that output and
execute the suggested change (or some variation of it, if desired).

The methods of the invention may be characterized as those that "produce" an
antibody (or a fragment thereof). The term "produce" means to "make," "generate," or
"design" a non-naturally occurring antibody (or fragment thereof). The antibody
produced may be considered more "mature" than either of the antibodies whose
sequences (e.g., whose CDR(s) and FRs) were used in its construction. While the
antibody produced may have a stronger affinity for an antigen, the methods of the
invention are not limited to those that produce antibodies with improved affinity. For
example, the methods of the invention can produce an antibody that has about the same
affinity for an antigen as it did prior to being modified by the present methods. When a
human antibody is modified, as described in the prior art, to contain murine CDRs, the
resulting CDR-grafted antibody can lose affinity for its antigen. Thus, for example,
where the methods of the invention are applied to CDR-grafted antibodies, they are useful
and successful when they prevent the loss of affinity (some or all of the loss) that would
otherwise occur with a conventional CDR graft.

In addition to minimizing the electrostatic contribution to the binding free energy,
the methods of the invention can further include minimizing the van der Waals or solvent
accessible surface area contribution to the binding free energy. In such further
computational analysis, additional amino acids in a CDR of the parent antibody may be
altered to generate the modified antibody, such that the binding free energy is further
reduced beyond what was achieved by solely minimizing the electrostatic contribution.
As few as one and as many as 50 CDR residues may be modified in the methods and
compositions of the instant invention. Most commonly, between 1 and 10 (e.g., 1, 2, 3, 4,
5, 6, 7, 8, 9, or 10) amino acid residues are altered by the methods and compositions of
the instant invention.

Antibodies produced by any of the methods of the invention are also within the
scope of the invention, as are pharmaceutical compositions containing those antibodies
and nucleic acids encoding such antibodies. The present invention also includes vectors
that express the modified antibodies (or polypeptides or fragments thereof) found by the
methods described above. These vectors can be used to transform cell lines, and such
transformed (e.g., transfected) cells are within the scope of the invention.

The details of one or more embodiments of the invention are set forth in the 
description below. Other features, objects, and advantages of the invention will be 
apparent from the description and the claims.

Brief Description of the Figures

Figure 1 illustrates geometries for modeling the binding interactions between an
antibody, or antigen-binding fragment thereof, and an antigen, when bound in a solvent
(top panel). In particular, the figure depicts the boundary-value problem, which comprises
a determination of the charge distribution in a spherical region of radius R with a
dielectric constant ε1 surrounded by solvent with a dielectric constant ε2, as well as other
geometries of the antibody-antigen interface (bottom panel; see also text, infra).

Figure 2 depicts nucleotide (SEQ ID NOs: 1, 3) and polypeptide (SEQ ID NOs: 2,
4) sequences for 5c8 heavy variable and light variable chain domains.

Detailed Description of the Invention 
In order to provide a clear understanding of the specification and claims, the 
following definitions are conveniently provided below. 

Definitions 

The term "structure", or "structural data", as used herein, includes the known,
predicted and/or modeled position(s) in three-dimensional space that are occupied by the
atoms, molecules, compounds, amino acid residues and portions thereof, and
macromolecules and portions thereof, of the invention, and, in particular, an antibody
bound to an antigen in a solvent. A number of methods for identifying and/or predicting
structure at the molecular/atomic level can be used such as X-ray crystallography, NMR
structural modeling, and the like.

The term "binding affinity", as used herein, includes the strength of a binding
interaction and therefore includes both the actual binding affinity as well as the apparent
binding affinity. The actual binding affinity is a ratio of the association rate over the
dissociation rate. Therefore, conferring or optimizing binding affinity includes altering
either or both of these components to achieve the desired level of binding affinity. The
apparent affinity can include, for example, the avidity of the interaction. For example, a
bivalent altered variable region binding fragment can exhibit altered or optimized binding
affinity due to its valency. Binding affinities may also be modeled, with such modeling
contributing to selection of residue alterations in the methods of the current invention.
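
As a minimal sketch of the ratio described in this definition, the following Python
function derives the association constant (K_A) and its reciprocal, the dissociation
constant (K_D), from an association rate constant and a dissociation rate constant. The
function name `equilibrium_constants` and the numerical rate constants are hypothetical
and are chosen only for illustration.

```python
def equilibrium_constants(k_on, k_off):
    """Return (K_A, K_D) from kinetic rate constants.

    k_on  : association rate constant, in 1/(M*s)
    k_off : dissociation rate constant, in 1/s
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    k_a = k_on / k_off  # association constant, in 1/M
    k_d = k_off / k_on  # dissociation constant, in M
    return k_a, k_d


# Hypothetical rate constants, for illustration only.
k_a, k_d = equilibrium_constants(k_on=1.0e5, k_off=1.0e-3)
print(f"K_A = {k_a:.3g} 1/M, K_D = {k_d:.3g} M")
```

Because K_A depends on both rate constants, a tenfold gain in affinity can come from a
tenfold faster association, a tenfold slower dissociation, or a combination of the two.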

The term "binding free energy" or "free energy of binding", as used herein,
includes its art-recognized meaning, and, in particular, as applied to antibody-antigen
interactions in a solvent. Reductions in binding free energy enhance antibody-antigen
affinities, whereas increases in binding free energy reduce antibody-antigen affinities.

The phrase "spatial representation of an optimal charge distribution", as used
herein, includes modeling the charge distribution for an antibody or antibody-antigen
complex, wherein the electrostatic contribution to free energy of the antibody when
bound to antigen is optimized (minimized), as compared to the known and/or modeled 
representation of charge distribution of the parent antibody and/or parent antibody when 
bound to antigen. The modeling of optimal charge distribution can be arrived at by an in 
silico process that incorporates the known and/or modeled structure(s) of an antibody 
and/or antibody-antigen complex as an input. Response continuum modeling (e.g., the 
linearized Poisson-Boltzmann equation) can be employed to express the electrostatic 
binding free energy of the antigen-antibody complex in a solvent as a sum of antibody 
desolvation, antibody-antigen interaction, and antigen desolvation terms. This in silico 
process is characterized by the ability to incorporate monopole, dipolar, and quadrupolar 
terms in representing charge distributions within the modeled charge distributions of the 
invention, and allows for extensive assessment of solvation/desolvation energies for 
antibody residues during transition of the antibody between unbound and bound states. 
The process of modeling the spatial representation of an optimal charge distribution for 
an antibody-antigen complex may additionally incorporate modeling of van der Waals 
forces, solvent accessible surface area forces, etc. 
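
The decomposition into desolvation and interaction terms can be illustrated with a
deliberately reduced, one-charge example. Under a linear-response continuum model, the
desolvation penalty of an antibody charge grows with the square of that charge, whereas
its interaction with a fixed set of antigen charges grows linearly, so the electrostatic
binding free energy is a parabola with a single minimum. The Python sketch below assumes
that form; the coefficient names and numerical values are hypothetical placeholders for
quantities that would, in practice, be obtained from continuum electrostatic calculations
on the complex.

```python
def electrostatic_binding_energy(q, desolv_coeff, interact_coeff, antigen_desolv):
    """Electrostatic binding free energy for one variable antibody charge q.

    desolv_coeff   : antibody desolvation penalty per unit charge squared (> 0)
    interact_coeff : antibody-antigen interaction energy per unit charge
    antigen_desolv : antigen desolvation penalty, independent of q
    """
    return desolv_coeff * q * q + interact_coeff * q + antigen_desolv


def optimal_charge(desolv_coeff, interact_coeff):
    """Charge that minimizes the quadratic binding free energy."""
    if desolv_coeff <= 0:
        raise ValueError("desolvation coefficient must be positive")
    return -interact_coeff / (2.0 * desolv_coeff)


# Hypothetical coefficients in arbitrary energy units.
a, b, c = 4.0, 6.0, 1.5
q_opt = optimal_charge(a, b)
print(q_opt, electrostatic_binding_energy(q_opt, a, b, c))
```

In this toy case the optimum is a partial charge rather than a full unit charge, which
illustrates why matching a residue to the optimum involves a tradeoff between the
desolvation penalty and the favorable interaction rather than a simple pairing of opposite
charges.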

The term "solvent", as used herein, includes its broadest art-recognized meaning, 
referring to any liquid in which an antibody of the instant invention is dissolved and/or 
resides. 

The term "antibody", as used herein, includes monoclonal antibodies (including 
full length monoclonal antibodies), polyclonal antibodies, multispecific antibodies (e.g.,
bispecific antibodies), chimeric antibodies, CDR-grafted antibodies, humanized 
antibodies, human antibodies and antigen-binding fragments thereof, for example, an 
antibody light chain (VL), an antibody heavy chain (VH), a single chain antibody (scFv), 
a F(ab')2 fragment, a Fab fragment, an Fd fragment, an Fv fragment, and a single domain 
antibody fragment (DAb). 

The term "antigen", as used herein, includes an entity (e.g., a proteinaceous entity 
or peptide) to which an antibody specifically binds, and includes, e.g., a predetermined 
antigen to which both a parent antibody and modified antibody as herein defined bind. 
The target antigen may be a polypeptide, carbohydrate, nucleic acid, lipid, hapten, or other
naturally occurring or synthetic compound. Preferably, the target antigen is a 
polypeptide. 

The term "CDR", as used herein, includes the complementarity determining
regions as described by, for example, Kabat, Chothia, or MacCallum et al. (see, e.g.,
Kabat et al., In "Sequences of Proteins of Immunological Interest," U.S. Department of
Health and Human Services, 1983; Chothia et al., J. Mol. Biol. 196:901-917, 1987; and
MacCallum et al., J. Mol. Biol. 262:732-745, 1996; the contents of which are
incorporated herein in their entirety).

The amino acid residue positions which typically encompass the CDRs as
described by each of the above cited references are set forth below for comparison.



Table of CDR Definitions

              Kabat      Chothia    MacCallum
VH CDR1       31-35      26-32      30-35
VH CDR2       50-65      53-55      47-58
VH CDR3       95-102     96-101     93-101
VL CDR1       24-34      26-32      30-36
VL CDR2       50-56      50-52      46-55
VL CDR3       89-97      91-96      89-96

The term "variable region", as used herein, includes the amino terminal portion of
an antibody which confers antigen binding onto the molecule and which is not the
constant region. The term is intended to include functional fragments, for example,
antigen-binding fragments, which maintain some or all of the binding function of the
whole variable region.

The term "framework region", as used herein, includes the antibody sequence that
is between and separates the CDRs. Therefore, a variable region framework is between
about 100-120 amino acids in length but is intended to reference only those amino acids
outside of the CDRs. For the specific example of a heavy chain variable region and for
the CDRs as defined by Kabat et al., framework region 1 corresponds to the domain of
the variable region encompassing amino acids 1-30; region 2 corresponds to the domain
of the variable region encompassing amino acids 36-49; region 3 corresponds to the
domain of the variable region encompassing amino acids 66-94; and region 4 corresponds
to the domain of the variable region from amino acid 103 to the end of the variable
region. The framework regions for the light chain are similarly separated by each of the
light chain variable region CDRs. Similarly, using the definition of CDRs by Chothia et
al. or MacCallum et al., the framework region boundaries are separated by the respective
CDR termini as described above.
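
The derivation of framework boundaries from CDR boundaries can be sketched as follows.
The Python function below takes the Kabat heavy chain CDR ranges from the table above and
returns the intervening spans. The function name `framework_ranges` is hypothetical, and
the final position of 113 is an assumption standing in for "the end of the variable
region"; the sketch also ignores lettered insertion positions (e.g., 52a or 100a) that
occur in practice.

```python
# Kabat heavy chain CDR ranges, taken from the table of CDR definitions.
KABAT_VH_CDRS = [(31, 35), (50, 65), (95, 102)]


def framework_ranges(cdrs, last_position):
    """Return the (start, end) spans lying outside the CDRs, ordered FR1..FR4."""
    frameworks = []
    start = 1
    for cdr_start, cdr_end in cdrs:
        frameworks.append((start, cdr_start - 1))  # span before this CDR
        start = cdr_end + 1                        # resume after this CDR
    frameworks.append((start, last_position))      # FR4 runs to the end
    return frameworks


# 113 is an assumed final position of the heavy chain variable region.
print(framework_ranges(KABAT_VH_CDRS, last_position=113))
```

The first three spans reproduce the 1-30, 36-49, and 66-94 ranges recited above, and the
fourth begins at position 103; supplying the Chothia or MacCallum ranges instead would
shift the framework boundaries accordingly.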

The terms "modified" or "altered", as used herein, include antibodies or
antigen-binding fragments thereof, that contain one or more amino acid changes in, for
example, a CDR(s), a framework region(s), or both as compared to the parent amino acid
sequence at the changed position. A modified or altered antibody typically has one or
more residues which have been substituted with another amino acid residue, related side
chain chemistry thereof, or one or more amino acid residue insertions or deletions.

The term "parent antibody", "original antibody", "starting antibody", "wild-type",
or "first antibody", as used herein, includes any antibody for which modification of
antibody-antigen binding affinity by the methods of the instant invention is desired.
Thus, the parent antibody represents the input antibody on which the methods of the
instant invention are performed. The parent polypeptide may comprise a native sequence
(i.e., a naturally occurring) antibody (including a naturally occurring allelic variant), or an
antibody with pre-existing amino acid sequence modifications (such as insertions,
deletions and/or other alterations) of a naturally occurring sequence. The parent antibody
may be a monoclonal, chimeric, CDR-grafted, humanized, or human antibody.

The terms "antibody variant", "modified antibody", "antibody containing a 
modified amino acid", "mutant", or "second antibody", "third antibody", etc., as used 
herein, include an antibody which has an amino acid sequence which differs from the 
amino acid sequence of a parent antibody. Preferably, the antibody variant comprises a 
heavy chain variable domain or a light chain variable domain having an amino acid 
sequence which is not found in nature. Such variants necessarily have less than 100% 
sequence identity or similarity with the parent antibody. In a preferred embodiment, the 
antibody variant will have an amino acid sequence from about 75% to less than 100% 
amino acid sequence identity or similarity with the amino acid sequence of either the 
heavy or light chain variable domain of the parent antibody, more preferably from about 
80% to less than 100%, more preferably from about 85% to less than 100%, more
preferably from about 90% to less than 100%, and most preferably from about 95% to less
than 100%. Identity or similarity with respect to this sequence is defined herein as the
percentage of amino acid residues in the candidate sequence that are identical (i.e., same
residue) with the parent antibody residues, after aligning the sequences and introducing
gaps, if necessary, to achieve the maximum percent sequence identity. Typically,
N-terminal, C-terminal, or internal extensions, deletions, or insertions into the antibody
sequence outside of the variable domain are not construed as affecting sequence identity 
or similarity. The antibody variant is generally one which comprises one or more amino 
acid alterations in or adjacent to one or more hypervariable regions thereof. The modified 
antibodies of the present invention may either be expressed, or alternatively, may be 
modeled in silico. 
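
As an illustration of the identity definition above, the following minimal Python sketch computes percent identity from two sequences that have already been aligned. The function name, the gap character "-", and the example sequences are assumptions made for illustration only; the alignment step itself is not shown.

```python
def percent_identity(parent_aligned: str, variant_aligned: str, gap: str = "-") -> float:
    """Percentage of candidate (variant) residues identical to the aligned parent residue.

    Both inputs are assumed to be pre-aligned strings of equal length in which
    gaps are written with the `gap` character.
    """
    if len(parent_aligned) != len(variant_aligned):
        raise ValueError("aligned sequences must have equal length")
    # Only columns that hold a residue in the candidate sequence are counted.
    columns = [(p, v) for p, v in zip(parent_aligned, variant_aligned) if v != gap]
    if not columns:
        raise ValueError("variant contains no residues")
    identical = sum(1 for p, v in columns if p == v)
    return 100.0 * identical / len(columns)


# Hypothetical ten-residue stretch with a single substitution at the last position.
example = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

Counting only the columns occupied by a candidate residue follows the wording of the definition (residues "in the candidate sequence"); other conventions for the denominator exist and would give slightly different values.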

The phrase "candidate amino acid residue position", as used herein, includes an 
amino acid position identified within an antibody of the present invention, wherein the 
substitution of the candidate amino acid is modeled, predicted, or known to impact charge 
distribution of the antibody upon alteration, deletion, insertion, or substitution with 
another amino acid. 

The term "elected amino acid", as used herein, refers to an amino acid residue(s) 
that has been selected by the methods of the present invention for substitution as a 
replacement amino acid at the candidate amino acid position within the antibody. 
Substitution of the candidate amino acid residue position with the elected amino acid 
residue may either reduce or increase the electrostatic contribution to binding free energy 
of the antibody-antigen complex. 

The terms "amino acid alteration" or "alteration for said amino acid", as used
herein, refer to a change in the amino acid sequence of a predetermined amino
acid sequence. Exemplary alterations include insertions, substitutions, and deletions.

The term "amino acid modification", as used herein, includes the replacement of
an existing amino acid residue side chain chemistry in a predetermined amino acid
sequence with another different amino acid residue side chain chemistry, by, for example,
amino acid substitution. Individual amino acid modifications of the instant invention are
selected from any one of the following: (1) the set of amino acids with nonpolar
sidechains, e.g., Ala, Cys, Ile, Leu, Met, Phe, Pro, Val, (2) the set of amino acids with
negatively charged side chains, e.g., Asp, Glu, (3) the set of amino acids with positively
charged sidechains, e.g., Arg, His, Lys, and (4) the set of amino acids with uncharged
polar sidechains, e.g., Asn, Cys, Gln, Gly, His, Met, Phe, Ser, Thr, Trp, Tyr, to which are
added Cys, Gly, Met and Phe.

The term "naturally occurring amino acid residue", as used herein, includes one 
encoded by the genetic code, generally selected from the group consisting of: alanine 
(Ala); arginine (Arg); asparagine (Asn); aspartic acid (Asp); cysteine (Cys); glutamine
(Gln); glutamic acid (Glu); glycine (Gly); histidine (His); isoleucine (Ile); leucine (Leu);
lysine (Lys); methionine (Met); phenylalanine (Phe); proline (Pro); serine (Ser); threonine
(Thr); tryptophan (Trp); tyrosine (Tyr); and valine (Val). 

The term "non-naturally occurring amino acid residue", as used herein, includes 
an amino acid residue other than those naturally occurring amino acid residues listed
above, which is able to covalently bind adjacent amino acid residue(s) in a polypeptide
chain. Examples of non-naturally occurring amino acid residues include norleucine,
ornithine, norvaline, homoserine and other amino acid residue analogues such as those
described in Ellman et al., Meth. Enzym. 202:301-336 (1991). To generate such
non-naturally occurring amino acid residues, the procedures of Noren et al., Science 244:182
(1989) and Ellman et al., supra, can be used. Briefly, these procedures involve
chemically activating a suppressor tRNA with a non-naturally occurring amino acid
residue followed by in vitro transcription and translation of the RNA.

The term "exposed" amino acid residue, as used herein, includes one in which at 
least part of its surface is exposed, to some extent, to solvent when present in a 
polypeptide (e.g., an antibody or polypeptide antigen) in solution. Preferably, the
exposed amino acid residue is one in which at least about one third of its side chain
surface area is exposed to solvent. Various methods are available for determining 
whether a residue is exposed or not, including an analysis of a molecular model or 
structure of the polypeptide. 
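
The preferred "one third" criterion can be expressed as a simple ratio test. The sketch below assumes that a side chain solvent-accessible surface area and a reference (fully exposed) side chain area have already been computed from a model or structure by some other tool; the function name and the numbers in the example are hypothetical.

```python
def is_exposed(side_chain_area: float, reference_side_chain_area: float,
               min_fraction: float = 1.0 / 3.0) -> bool:
    """Return True if at least `min_fraction` of the side chain surface is solvent exposed.

    Areas are in any consistent unit (e.g., square angstroms). The reference
    area is the side chain area of the same residue type when fully exposed.
    """
    if reference_side_chain_area <= 0:
        raise ValueError("reference area must be positive")
    return side_chain_area / reference_side_chain_area >= min_fraction


# Hypothetical values: 40 of 100 square angstroms exposed.
exposed = is_exposed(40.0, 100.0)
```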

The term "treatment" refers to both therapeutic treatment and prophylactic or
preventative measures. Those in need of treatment include those already with the
disorder as well as those in which the disorder is to be prevented.

The term "disorder or disease" is any condition that would benefit from treatment 
with the antibody variant. This includes chronic and acute disorders or diseases including 
those pathological conditions which predispose the mammal to the disorder in question.

The terms "cell", "cell line", "cell culture", or "host cell", as used herein, include
"transformants", "transformed cells", or "transfected cells" and progeny thereof. Host
cells within the scope of the invention include prokaryotic cells such as E. coli, lower
eukaryotic cells such as yeast cells, insect cells, and higher eukaryotic cells such as
vertebrate cells, for example, mammalian cells, e.g., Chinese hamster ovary cells and NS0
myeloma cells.

Detailed Description 

Overview 

The methods described herein can be used to obtain an optimized antibody (or an
antigen-binding fragment thereof). Based on a computational analysis, positions are
identified within any given antibody where there is a difference (the larger the difference,
the more significant it can be) between the charge distribution in an optimized
antibody-antigen complex and that in an original antibody-antigen complex. Such differences in
charge distribution are also associated with changes in binding free energy of the antibody
when bound to the antigen in a solvent. The amino acid residue at such a position can
then be changed so that the electrostatic forces in the original antibody more nearly
approach (or in alternative embodiments, are more divergent from) those in the optimized
antibody, thereby modulating binding free energy of the antibody when bound to an
antigen in a solvent. Changes to the antibody are introduced according to a set of discrete
criteria or rules as described herein.

Rules for Modifying Antibodies for Improved Function 

The rules of the invention can be applied as follows. To modulate the antigen-binding
affinity of an antibody, for example, to improve or restore such binding, basic
sequence and/or structural data is first acquired. Electrostatic charge optimization
techniques are then applied to suggest improved-affinity mutants. Typically, an
electrostatic charge optimization is first used to determine the position(s) of the CDR
residue(s) that are sub-optimal for binding (Lee and Tidor, J. Chem. Phys. 106:8681-8690,
1997; Kangas and Tidor, J. Chem. Phys. 109:7522-7545, 1998). Then, one or more
CDR mutations (i.e., modifications) are subjected to further computational analysis. Based
on these calculations, the binding affinity is then determined for a subset of modified
antibodies having one or more modifications according to the rules of the invention.

Using a continuum electrostatics model, an electrostatic charge optimization can
be performed on each side chain of the amino acids in the CDRs of the antibody. A
charge optimization gives charges at atom centers but does not always yield actual
mutation(s). Accordingly, a round of charge optimizations can be performed with various
constraints imposed to represent natural side chain characteristics at the positions of
interest. For example, an optimization can be performed for a net side chain charge of -1,
0, and +1 with the additional constraint that no atom's charge exceeds a particular value,
e.g., 0.85 electron charge units. Candidate amino acid side chain positions, and residue
modifications at these positions, are then determined based on the potential gain in
electrostatic binding free energy observed in the optimizations.
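
By way of a hedged sketch only: in a linearized continuum electrostatics framework of the kind described in the cited Lee and Tidor and Kangas and Tidor papers, the electrostatic contribution to binding free energy is a quadratic function of the partial charges being optimized, which is why a well-defined optimum exists. The constrained optimization described above can then be written as follows. The symbols (the charge vector q, the matrix L, the vector c, the constant term, the net charge Q, and the bound q_max) are notation assumed here for illustration and are not taken from this specification.

```latex
\Delta G_{\mathrm{es}}(\mathbf{q})
  = \mathbf{q}^{\mathsf{T}} L\, \mathbf{q} + \mathbf{c}^{\mathsf{T}} \mathbf{q} + \Delta G_{0},
\qquad
\min_{\mathbf{q}} \ \Delta G_{\mathrm{es}}(\mathbf{q})
\quad \text{subject to} \quad
\sum_{i} q_{i} = Q, \ \ Q \in \{-1,\, 0,\, +1\},
\qquad
|q_{i}| \le q_{\max} \ \ (\text{e.g., } q_{\max} = 0.85\, e).
```

Here the quadratic term represents the desolvation cost of the side chain charges, the linear term represents their interaction with the charges of the binding partner, and one optimization is run for each permitted value of the net charge Q.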

The binding free energy difference (in kcal/mol) in going from the native residue to a
completely uncharged sidechain isostere, i.e., a residue with the same shape but no
charges or partial charges on the atoms, can be calculated. Negative numbers indicate a
predicted increase of binding affinity. An optimal charge distribution in which the net side
chain charge is +1, 0, or -1 can also be used to calculate the binding free energy difference.

In those instances in which the binding free energy difference is favorable
(ΔG < -0.25 kcal/mol) and associated with a transition from the native residue to a completely
uncharged side chain isostere, i.e., a residue with the same shape but no charges or partial
charges on the atoms, modifications from the set of amino acids with nonpolar sidechains,
e.g., Ala, Cys, Ile, Leu, Met, Phe, Pro, Val, are selected.

Where the binding free energy difference that can be obtained with an optimal
charge distribution in the side chain and a net side chain charge of -1 is favorable
(ΔG < -0.25 kcal/mol), modifications from the set of amino acids with negatively charged side
chains, e.g., Asp, Glu, are selected.

Similarly, where the binding free energy difference that can be obtained with an
optimal charge distribution in the side chain and a net side chain charge of +1 is favorable
(ΔG < -0.25 kcal/mol), modifications from the set of amino acids with positively charged
sidechains, e.g., Arg, His, Lys, are selected.

Finally, in those cases where the binding free energy difference that can be
obtained with an optimal charge distribution in the side chain and a net side chain charge
of 0 is favorable (ΔG < -0.25 kcal/mol), modifications from the set of amino acids with
uncharged polar sidechains, e.g., Asn, Cys, Gln, Gly, His, Met, Phe, Ser, Thr, Trp, Tyr, to
which are added Cys, Gly, Met and Phe, are selected.
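
The four selection rules above amount to a threshold test applied to each optimization result. The following Python sketch shows one way the rules could be encoded; the dictionary keys, the function name, and the example values are hypothetical, while the -0.25 kcal/mol threshold and the residue sets are those recited above.

```python
THRESHOLD_KCAL_PER_MOL = -0.25  # a difference below this value is treated as favorable

# Residue sets recited in the rules, keyed by the optimization they correspond to.
# The key names are illustrative labels, not terms used in the specification.
SIDE_CHAIN_SETS = {
    "uncharged_isostere": ["Ala", "Cys", "Ile", "Leu", "Met", "Phe", "Pro", "Val"],
    "net_charge_-1": ["Asp", "Glu"],
    "net_charge_+1": ["Arg", "His", "Lys"],
    "net_charge_0": ["Asn", "Cys", "Gln", "Gly", "His", "Met",
                     "Phe", "Ser", "Thr", "Trp", "Tyr"],
}


def elect_side_chain_sets(delta_g: dict) -> dict:
    """Return the residue sets whose binding free energy difference is favorable.

    `delta_g` maps the labels above to a predicted difference in kcal/mol for
    one candidate position; missing labels are simply skipped.
    """
    selected = {}
    for label, residues in SIDE_CHAIN_SETS.items():
        value = delta_g.get(label)
        if value is not None and value < THRESHOLD_KCAL_PER_MOL:
            selected[label] = residues
    return selected


# Hypothetical results for a single CDR position.
choices = elect_side_chain_sets({
    "uncharged_isostere": -0.10,
    "net_charge_-1": -0.80,
    "net_charge_+1": 0.40,
    "net_charge_0": -0.25,
})
```

In this example only the negatively charged set is returned, because the comparison is strict and a value exactly at the threshold is not counted as favorable.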

As described herein, the designed modified antibodies can be built in silico and 
the binding energy recalculated. Modified side chains can be built by performing a 
rotamer dihedral scan in CHARMM, using dihedral angle increments of 60 degrees, to 
determine the most desirable position for each side chain. Binding energies are then 
calculated for the wild type (parent) and mutant (modified) complexes using the Poisson- 
Boltzmann electrostatic energy and additional terms for the van der Waals energy and 
buried surface area. 
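
The dihedral scan and the composite energy can be sketched in plain Python as follows. This is not CHARMM input and does not call CHARMM; the energy function is a toy placeholder, and the function names and the surface-area coefficient are assumptions introduced only to show the shape of the search and of the summed score.

```python
import itertools
import math


def dihedral_grid(n_dihedrals: int, step_degrees: int = 60):
    """Yield every combination of side chain dihedral angles on a regular grid."""
    angles = range(-180, 180, step_degrees)
    return itertools.product(angles, repeat=n_dihedrals)


def best_rotamer(n_dihedrals: int, energy_fn, step_degrees: int = 60):
    """Return the grid point with the lowest energy according to `energy_fn`."""
    return min(dihedral_grid(n_dihedrals, step_degrees), key=energy_fn)


def toy_energy(chis):
    """Placeholder energy with its minimum at chi1 = -60 and chi2 = 120 degrees."""
    targets = (-60, 120)
    return sum(1.0 - math.cos(math.radians(c - t)) for c, t in zip(chis, targets))


def binding_score(electrostatic: float, van_der_waals: float,
                  buried_area: float, area_coeff: float) -> float:
    """Sum an electrostatic term, a van der Waals term, and a buried-area term.

    `area_coeff` (energy per unit area) is an assumed parameter; the
    specification does not give its value.
    """
    return electrostatic + van_der_waals + area_coeff * buried_area


chosen = best_rotamer(2, toy_energy)
```

With 60 degree increments, a side chain with two dihedrals gives 36 grid points, so an exhaustive scan is inexpensive. The difference between the score of the modified complex and that of the parent complex then serves as the predicted change in binding energy.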

Results from these computational modification calculations are then reevaluated 
as needed, for example, after subsequent reiterations of the method either in silico or 
informed by additional experimental structural/functional data. 

The rules allow for several predictions to be made which can be categorized as 
follows:

1) modifications at the interaction interface involving residues on the antibody
that become partially buried upon binding (interactions are improved by making
hydrogen bonds with the antigen);

2) modifications of polar residues on the antibody that become buried upon
binding and thus pay a desolvation penalty but do not make any direct electrostatic
interactions with the antigen (improvements are usually made by modifying to a
hydrophobic residue with similar shape to the wild-type residue or by adding a residue
that can make favorable electrostatic interactions); and

3) modifications of surface residues on the antibody that are in regions of
uncomplementary potentials. These modifications are believed to improve long-range
electrostatic interactions between the antibody and antigen without perturbing packing
interactions at the binding interface.

Thus practiced, the rules of the invention allow for the successful prediction of
affinity-altering, e.g., enhancing, side chain modifications. These findings can be
classified into three general classes of modifications. The first type of modification
involves residues at the interface across from a charged group on the antigen capable of
making a hydrogen bond; the second type involves buried polar residues that pay a
desolvation penalty upon binding but do not make back electrostatic interactions; and the
third type involves long-range electrostatic interactions.

The first type of modification is determined by inspection of basic
physical/chemical considerations, as these residues essentially make hydrogen bonds with
unsatisfied hydrogen partners of the antigen. Unlike other methods, the rules of the
invention allow for surprising residue modifications in which the cost of desolvation is
allowed to outweigh the beneficial interaction energy.

The second type of modification represents still another set of modifications, as
the energy gained is primarily a result of eliminating an unfavorable desolvation while
maintaining non-polar interactions.

The third type of modification concerns long-range interactions that show
potential for significant gain in affinity. These types of modifications are particularly
interesting because they do not make direct contacts with the antigen and, therefore, pose
less of a perturbation in the delicate interactions at the antibody-antigen interface.

Accordingly, when the desired side chain chemistries are determined for the
candidate amino acid position(s) according to the rules, the residue position(s) is then
modified or altered, e.g., by substitution, insertion, or deletion, as further described
herein.

In addition to the above rules for antibody modification, it is noted that certain
determinations, e.g., solvent effects, can be factored into initial (and subsequent)
calculations of optimal charge distributions.

Obtaining an Antibody or Antigen-Binding Fragment Thereof

The methods of the invention that are aimed at generating a non-naturally
occurring antibody (or an antigen-binding fragment thereof) can, but do not necessarily,
begin by obtaining an antibody. That antibody may be referred to herein as a "parent"
antibody or sometimes as a "first" antibody, and it can be used to obtain information that
will allow one to modify or alter one or more amino acid residues either within that
antibody (i.e., within the parent antibody) or within a modified or altered antibody having
a sequence that is similar to, or that contains portions of, the sequence of the parent
antibody. As described herein, for example, one or more of the CDRs (or portions
thereof) of a parent antibody can be replaced with the corresponding CDR(s) of the
modified antibody by standard genetic engineering techniques to accomplish the so-called
CDR graft or transplant. Accordingly, the method can begin with a mammalian
monoclonal or polyclonal antibody (e.g., murine or primate), chimeric, CDR-grafted,
humanized, or human antibody.

The parent antibodies can be obtained from art-recognized sources or produced
according to art-recognized technologies. For example, the parent antibody can be a
CDR-grafted or humanized antibody having CDR regions derived from another source or
species, e.g., murine.

The parent antibody or any of the modified antibodies of the invention can be in
the format of a monoclonal antibody. Methods for producing monoclonal antibodies are
known in the art (see, e.g., Kohler and Milstein, Nature 256:495-497, 1975), as well as
techniques for stably introducing immunoglobulin-encoding DNA into myeloma cells
(see, e.g., Oi et al., Proc. Natl. Acad. Sci. USA 80:825-829, 1983; Neuberger, EMBO J.
2:1373-1378, 1983; and Ochi et al., Proc. Natl. Acad. Sci. USA 80:6351-6355, 1983).
These techniques, which include in vitro mutagenesis and DNA transfection, allow for the
construction of recombinant immunoglobulins; these techniques can be used to produce
the parent and modified antibodies used in the methods of the invention or to produce the
modified antibodies that result from those methods. Alternatively, the parent antibodies
can be obtained from a commercial supplier. Antibody fragments (scFvs and Fabs) can
also be produced in E. coli (production methods and cellular hosts are described further
below).

The parent antibody or any of the modified antibodies of the invention can be an 
antibody of the IgA, IgD, IgE, IgG, or IgM class. 

As noted above, the methods of the invention can be applied to more than just
tetrameric antibodies (e.g., antibodies having the structure of an immunoglobulin of the
G class (an IgG)). For example, the methods of modifying an antibody can be carried out
with antigen-binding fragments of any antibody as well. The fragments can be
recombinantly produced and engineered, synthesized, or produced by digesting an
antibody with a proteolytic enzyme. For example, the fragment can be an Fab fragment;
digestion with papain breaks the antibody at the region, before the inter-chain (i.e.,
V H-V H) disulphide bond, that joins the two heavy chains. This results in the formation of two
identical fragments that contain the light chain and the V H and C H 1 domains of the heavy
chain. Alternatively, the fragment can be an F(ab')2 fragment. These fragments can be
created by digesting an antibody with pepsin, which cleaves the heavy chain after the
inter-chain disulphide bond, and results in a fragment that contains both antigen-binding
sites. Yet another alternative is to use a "single chain" antibody. Single-chain Fv (scFv)
fragments can be constructed in a variety of ways. For example, the C-terminus of V H
can be linked to the N-terminus of V L. Typically, a linker (e.g., (GGGGS)4) is placed
between V H and V L. However, the order in which the chains are linked can be
reversed, and tags that facilitate detection or purification (e.g., Myc-, His-, or FLAG-tags)
can be included (tags such as these can be appended to any antibody or antibody fragment
of the invention; their use is not restricted to scFv). Accordingly, and as noted below,
tagged antibodies are within the scope of the present invention. In alternative
embodiments, the antibodies used in the methods described herein, or generated by those
methods, can be heavy chain dimers or light chain dimers. Still further, an antibody light
or heavy chain, or portions thereof, for example, a single domain antibody (DAb), can be
used.
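
The scFv layout described above (one variable domain, a flexible linker, the other variable domain, and an optional tag) can be illustrated at the sequence level with a short Python sketch. The function name, its parameters, and the placeholder domain strings are hypothetical; real V H and V L sequences would be substituted, and placing the tag at the C-terminus is only one of the possible arrangements.

```python
def build_scfv(vh: str, vl: str, linker_unit: str = "GGGGS", repeats: int = 4,
               vh_first: bool = True, c_terminal_tag: str = "") -> str:
    """Assemble a single-chain Fv amino acid sequence.

    The default linker is four repeats of GGGGS. Setting `vh_first=False`
    reverses the domain order; `c_terminal_tag` appends, e.g., a His-tag.
    """
    linker = linker_unit * repeats
    first, second = (vh, vl) if vh_first else (vl, vh)
    return first + linker + second + c_terminal_tag


# Placeholder strings stand in for real variable domain sequences.
scfv = build_scfv("VHDOMAIN", "VLDOMAIN")
scfv_reversed_tagged = build_scfv("VHDOMAIN", "VLDOMAIN",
                                  vh_first=False, c_terminal_tag="HHHHHH")
```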

As the methods of the invention can be iterative, the parent antibody may not be a 
naturally occurring antibody. As the process of modifying an antibody can be repeated as 
many times as necessary, the starting antibody (or antigen-binding fragment thereof) can 
be wholly non-human or an antibody containing human FRs and non-human (e.g., 
murine) CDRs. That is, the "parent" antibody can be a CDR-grafted antibody that is 
subjected to the methods of the invention in order to improve the affinity of the antibody, 
i.e., affinity mature the antibody. As noted above, the affinity may only be improved to 
the extent that it is about the same as (or not significantly worse than) the affinity of the 
naturally occurring human antibody (the FR-donor) for its antigen. Thus, the "parent" 
antibody may, instead, be an antibody created by one or more earlier rounds of 
modification, including an antibody that contains sequences of more than one species 
(e.g., human FRs and non-human CDRs). The methods of the invention encompass the
use of a "parent" antibody that includes one or more CDRs from a non-human (e.g., 
murine) antibody and the FRs of a human antibody. Alternatively, the parent antibody 
can be completely human. 

Where the structure is available, of course, one may begin the computational 
analysis with that structure (rather than creating it again). 

The Method of the Invention Informed by Antibody-Antigen Structural Data 

Proteins are known to fold into three-dimensional structures that are dictated by
the sequences of their amino acids and by the solvent in which a given protein (or
protein-containing complex) is provided. The three-dimensional structure of a protein influences
its biological activity and stability, and that structure can be determined or predicted in a
number of ways. Generally, empirical methods use physical biochemical analysis.
Alternatively, tertiary structure can be predicted using model building of
three-dimensional structures of one or more homologous proteins (or protein complexes) that
have a known three-dimensional structure. X-ray crystallography is perhaps the best-known
way of determining protein structure (accordingly, the term "crystal structure"
may be used in place of the term "structure"), but estimates can also be made using
circular dichroism, light scattering, or by measuring the absorption and emission of
radiant energy. Other useful techniques include neutron diffraction and nuclear magnetic
resonance (NMR). All of these methods are known to those of ordinary skill in the art,
and they have been well described in standard textbooks (see, e.g., Physical Chemistry,
4th Ed., W.J. Moore, Prentice-Hall, N.J., 1972, or Physical Biochemistry, K.E. Van
Holde, Prentice-Hall, N.J., 1971) and numerous publications. Any of these techniques
can be carried out to determine the structure of an antibody, or antibody-antigen-containing
complex, which can then be analyzed according to the methods of the present
invention and, e.g., used to inform one or more steps of the method of the invention.

Similarly, these and like methods can be used to obtain the structure of an antigen 
bound to an antibody fragment, including a fragment consisting of, e.g., a single-chain
antibody, Fab fragment, etc. Methods for forming crystals of an antibody, an antibody
fragment, or scFv-antigen complex have been reported by, for example, van den Elsen
et al. (Proc. Natl. Acad. Sci. USA 96:13679-13684, 1999, which is expressly incorporated
by reference herein). 

Computational Analysis

The basic computational formulae used in carrying out the methods of the 
invention are provided in, e.g., U.S. Patent No. 6,230,102, the contents of which are 
hereby incorporated by reference in the present application in their entirety. 

As noted above, antibodies are altered (or "modified") according to the results of a
computational analysis of electrostatic forces between the antibody and an antigen to
which it binds, preferably, in accordance with the discrete criteria or rules of the invention
described herein. The computational analysis allows one to predict the optimal charge
distribution within the antibody, and one way to represent the charge distribution in a
computer system is as a set of multipoles. Alternatively, the charge distribution can be
represented by a set of point charges located at the positions of the atoms of the antibody.
Once a charge distribution is determined (preferably, an optimal charge distribution), one
can modify the antibody to match, or better match, that charge distribution.
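
A point-charge representation can be held in a very simple data structure. The Python sketch below stores charges at atomic positions and evaluates a pairwise Coulomb interaction in a uniform dielectric. This is a deliberately simplified stand-in used only to show the representation; it is not the continuum solvent model of the invention, and the class name, function names, and example coordinates are assumptions.

```python
import math
from dataclasses import dataclass

# Coulomb constant in kcal*angstrom/(mol*e^2), for charges in electron units.
COULOMB_KCAL_ANGSTROM = 332.0637


@dataclass(frozen=True)
class PointCharge:
    x: float  # angstroms
    y: float
    z: float
    q: float  # electron charge units


def net_charge(charges) -> float:
    """Total charge of a group of point charges."""
    return sum(c.q for c in charges)


def coulomb_interaction(group_a, group_b, dielectric: float = 1.0) -> float:
    """Pairwise Coulomb energy (kcal/mol) between two groups in a uniform dielectric."""
    energy = 0.0
    for a in group_a:
        for b in group_b:
            r = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
            energy += COULOMB_KCAL_ANGSTROM * a.q * b.q / (dielectric * r)
    return energy


# Hypothetical pair of opposite unit charges 1 angstrom apart.
side_chain = [PointCharge(0.0, 0.0, 0.0, 1.0)]
antigen_atom = [PointCharge(0.0, 0.0, 1.0, -1.0)]
pair_energy = coulomb_interaction(side_chain, antigen_atom)
```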

The computational analysis can be mediated by a computer-implemented process
that carries out the calculations described in U.S. Patent No. 6,230,102. The computer
program is adapted herein to consider the real world context of antigen-antibody binding
(and unlike other methods, the methods of the invention take into account, e.g., solvent,
long-range electrostatics, and dielectric effects in the binding between an antibody and its
antigen in a solvent). The process is used to identify modifications to the antibody
structure that will achieve a charge distribution on the "matured" antibody that minimizes
the electrostatic contribution to binding free energy between the matured antibody and its
antigen (compared to that of the unmodified ("starting" or "parent") antibody). As is
typical, the computer system (or device(s)) that performs the operations described here
(and in more detail in U.S. Patent No. 6,230,102) will include an output device that
displays information to a user (e.g., a CRT display, an LCD, a printer, a communication
device such as a modem, audio output, and the like). In addition, instructions for carrying
out the method, in part or in whole, can be conferred to a medium suitable for use in an
electronic device for carrying out the instructions. Thus, the methods of the invention are
amenable to a high throughput approach comprising software (e.g., computer-readable
instructions) and hardware (e.g., computers, robotics, and chips). The computer-implemented
process is not limited to a particular computer platform, particular
processor, or particular high-level programming language.

A useful process is set forth in Appendix A (U.S. Patent No. 6,230,102) and a 
more detailed exposition is provided in Appendix B (Lee and Tidor, J. Chem. Phys.
106:8681-8690, 1997), each of which is expressly incorporated herein by reference.

Analysis of Affinity 

Affinity, avidity, and/or specificity can be measured in a variety of ways. 
Generally, and regardless of the precise manner in which affinity is defined or measured,
the methods of the invention improve antibody affinity when they generate an antibody
that is superior in any aspect of its clinical application to the antibody (or antibodies)
from which it was made (for example, the methods of the invention are considered
effective or successful when a modified antibody can be administered at a lower dose or
less frequently or by a more convenient route of administration than an antibody (or
antibodies) from which it was made).

More specifically, the affinity between an antibody and an antigen to which it 
binds can be measured by various assays, including, e.g., a BiaCore assay or the
KinExA™ 3000 assay (available from Sapidyne Instruments (Boise, ID)). The latter
assay was used to measure the affinity of AQC2 scFv mutants for the VLA1 I domain
(see the Examples below). Briefly, sepharose beads are coated with antigen (in the
Examples below, the antigen is a VLA1 I-domain protein, but the antigen used in the
methods of the invention can be any antigen of interest (e.g., a cancer antigen; a cell
surface protein or secreted protein; an antigen of a pathogen (e.g., a bacterial or viral
antigen (e.g., an HIV antigen, an influenza antigen, or a hepatitis antigen)), or an allergen))
by covalent attachment. (It is understood, however, that the methods described here are
generally applicable; they are not limited to the production of antibodies that bind any 
particular antigen or class of antigens.) 

Those of ordinary skill in the art will recognize that determining affinity is not 
always as simple as looking at a single, bottom-line figure. Since antibodies have two
arms, their apparent affinity is usually much higher than the intrinsic affinity between the 
variable region and the antigen (this is believed to be due to avidity). Intrinsic affinity 
can be measured using scFv or Fab fragments. 
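
Measured dissociation constants can be related to binding free energies through the standard thermodynamic relation ΔG = RT ln(K_D), so that the change on modification is ΔΔG = RT ln(K_D,variant / K_D,parent). The following Python sketch applies that relation; the function names, the default temperature, and the example K_D values are assumptions for illustration and are not data from the Examples.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_parent: float, kd_variant: float,
                  temperature_k: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol); negative means tighter binding."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_variant / kd_parent)


# Hypothetical tenfold improvement, from 10 nM to 1 nM.
ddg = delta_delta_g(1e-8, 1e-9)
```

At room temperature a tenfold improvement in K_D corresponds to roughly -1.4 kcal/mol, which gives a sense of scale for computed binding free energy differences.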

Chimeric Antibodies and Antibody Fragments

The term "chimeric antibody" is used to describe a protein comprising at least an 
antigen-binding portion of an immunoglobulin molecule that is attached by, for example, 
a peptide bond or peptide linker, to a heterologous protein or a peptide thereof. The
"heterologous" protein can be a non-immunoglobulin or a portion of an immunoglobulin
of a different species, class or subclass.

There are numerous processes by which such antibodies can be made. For 
example, one can prepare an expression vector including a promoter that is operably 
linked to a DNA sequence that encodes at least V H or V L and a sequence that encodes the 
heterologous protein (or a peptide thereof (the peptide being of a sufficient length that it
can be recognized as a non-immunoglobulin molecule (i.e., a peptide having no
substantial sequence identity to an immunoglobulin))). If necessary, or desired, one can
prepare a second expression vector including a promoter that is operably linked to a DNA
sequence that encodes the complementary variable domain (i.e., where the parent
expression vector encodes V H, the second expression vector encodes V L and vice versa).
A cell line (e.g., an immortalized mammalian cell line) can then be transformed with one
or both of the expression vectors and cultured under conditions that permit expression of
the chimeric variable domain or chimeric antibody (see, e.g., International Patent
Application No. PCT/GB85/00392 to Neuberger et al.). While Neuberger et al.
produced chimeric antibodies in which complete variable domains were encoded by the
parent expression vector, this method can be used to express the modified antibodies of
the present invention, antibodies containing full-length heavy and light chains, or
fragments thereof (e.g., the Fab, F(ab')2, or scFv fragments described herein). The
methods are not limited to expression of chimeric antibodies.

The antibodies produced by the methods described herein can be labeled just as
any other antibody can be labeled. Accordingly, the invention encompasses antibodies
produced by the present methods that are labeled with detectable labels such as a
radioactive label (e.g., 32P or 35S), an enzyme (e.g., horseradish peroxidase,
chloramphenicol acetyltransferase (CAT), β-galactosidase (β-gal), or the like), a
chromophore or a fluorophore including a quantum dot. The labeled antibodies can be
used to carry out diagnostic procedures (many diagnostic assays rely on detection of a
protein antigen (such as PSA)) in a variety of cell or tissue types. For imaging
procedures, in vitro or in vivo, the altered antibodies produced by the methods described
herein can be labeled with additional agents, such as NMR contrasting agents, X-ray
contrasting agents, or quantum dots. Methods for attaching a detectable agent to
polypeptides, including antibodies or fragments thereof, are known in the art. The
antibodies can also be attached to an insoluble support (such as a bead, a glass or plastic
slide, or the like).



Construction of Modified Antibodies

Once the sequence of an antibody (e.g., a CDR-grafted or otherwise modified or 
"humanized" antibody) has been decided upon, that antibody can be made by techniques 
well known in the art of molecular biology. More specifically, recombinant DNA 
techniques can be used to produce a wide range of polypeptides by transforming a host
cell with a nucleic acid sequence (e.g., a DNA sequence that encodes the desired protein
products (e.g., a modified heavy or light chain; the variable domains thereof, or other 
antigen-binding fragments thereof)). 

More specifically, the methods of production can be carried out as described
above for chimeric antibodies. The DNA sequence encoding, for example, an altered
variable domain can be prepared by oligonucleotide synthesis. The variable domain can
be one that includes the FRs of a human acceptor molecule and the CDRs of a donor, e.g.,
murine, either before or after one or more of the residues (e.g., a residue within a CDR)
have been modified to facilitate antigen binding. This is facilitated by determining the
framework region sequence of the acceptor antibody and at least the CDR sequences of
the donor antibody. Alternatively, the DNA sequence encoding the altered variable
domain may be prepared by primer directed oligonucleotide site-directed mutagenesis.
This technique involves hybridizing an oligonucleotide coding for a desired mutation
with a single strand of DNA containing the mutation point and using the single strand as a
template for extension of the oligonucleotide to produce a strand containing the mutation.
This technique, in various forms, is described by, e.g., Zoller and Smith (Nuc. Acids Res.
10:6487-6500, 1982), Norris et al. (Nuc. Acids Res. 11:5103-5112, 1983), Zoller and
Smith (DNA 3:479-488, 1984), and Kramer et al. (Nuc. Acids Res. 10:6475-6485, 1982).

Other methods of introducing mutations into a sequence are known as well and
can be used to generate the altered antibodies described herein (see, e.g., Carter et al.,
Nuc. Acids Res. 13:4431-4443, 1985). The oligonucleotides used for site-directed
mutagenesis can be prepared by oligonucleotide synthesis or isolated from DNA coding
for the variable domain of the donor antibody by use of suitable restriction enzymes.
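
By way of illustration only, the single-codon substitution that underlies oligonucleotide-directed mutagenesis can be sketched in Python as follows. The function name design_mutagenic_oligo, the flank parameter, and the example sequence are hypothetical and are not taken from the cited references. The sketch returns the sense-strand sequence of the oligonucleotide; the strand actually synthesized would depend on which template strand is used.

```python
def design_mutagenic_oligo(coding_seq, codon_index, new_codon, flank=9):
    """Return (oligo, mutated_seq) for a single-codon substitution.

    coding_seq  : in-frame coding DNA (sense strand), illustrative only
    codon_index : 0-based index of the codon to replace
    new_codon   : three-base replacement codon
    flank       : number of unchanged bases kept on each side of the codon
    """
    coding_seq = coding_seq.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three DNA bases")
    start = 3 * codon_index
    if start < 0 or start + 3 > len(coding_seq):
        raise ValueError("codon_index lies outside the coding sequence")
    # Swap the codon, leaving the rest of the sequence untouched.
    mutated = coding_seq[:start] + new_codon + coding_seq[start + 3:]
    # The oligonucleotide spans the new codon plus the flanking bases.
    lo = max(0, start - flank)
    hi = min(len(mutated), start + 3 + flank)
    return mutated[lo:hi], mutated


# Hypothetical template: replace the fourth codon (TAC, Tyr) with TTC (Phe).
oligo, mutated = design_mutagenic_oligo("ATGGCTAGCTACGGTTGGCAGCTGAAA", 3, "TTC", flank=6)
print(oligo)    # GCTAGCTTCGGTTGG
print(mutated)  # ATGGCTAGCTTCGGTTGGCAGCTGAAA
```

The flanking length is a design choice: longer flanks generally favor stable hybridization of the mismatched oligonucleotide to the template.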

Host Cells and Cell Lines for Expression of the Modified Antibodies

Either the parent antibodies or modified antibodies as described herein (whether in
a final form or an intermediate form) can be expressed by host cells or cell lines in
culture. They can also be expressed in cells in vivo. The cell line that is transformed
(e.g., transfected) to produce the altered antibody can be an immortalised mammalian cell
line, such as those of lymphoid origin (e.g., a myeloma, hybridoma, trioma or quadroma
cell line). The cell line can also include normal lymphoid cells, such as B-cells, that have
been immortalized by transformation with a virus (e.g., the Epstein-Barr virus).

Although typically the cell line used to produce the altered antibody is a
mammalian cell line, cell lines from other sources (such as bacteria and yeast) can also be
used. In particular, E. coli-derived bacterial strains can be used, especially for, e.g., phage
display.

Some immortalized lymphoid cell lines, such as myeloma cell lines, in their
normal state, secrete isolated Ig light or heavy chains. If such a cell line is transformed
with a vector that expresses an altered antibody, prepared during the process of the
invention, it will not be necessary to carry out the remaining steps of the process,
provided that the normally secreted chain is complementary to the variable domain of the
Ig chain encoded by the vector prepared earlier.

If the immortalised cell line does not secrete a chain or does not secrete a complementary
chain, it will be necessary to introduce into the cells a vector that encodes the appropriate
complementary chain or fragment thereof.

In the case where the immortalised cell line secretes a complementary light or
heavy chain, the transformed cell line may be produced, for example, by transforming a
suitable bacterial cell with the vector and then fusing the bacterial cell with the
immortalised cell line (e.g., by spheroplast fusion). Alternatively, the DNA may be
directly introduced into the immortalised cell line by electroporation.

Pharmaceutical Formulations and Their Uses

In prophylactic applications, pharmaceutical compositions or medicaments are
administered to a subject suffering from a disorder in an amount sufficient to eliminate or
reduce the risk, lessen the severity, or delay the outset of the disorder, including
biochemical, histologic and/or behavioral symptoms of the disorder, its complications and
intermediate pathological phenotypes presenting during development of the disorder. In
therapeutic applications, compositions or medicaments are administered to a subject
suspected of, or already suffering from, such a disorder in an amount sufficient to cure, or
at least partially arrest, the symptoms of the disorder (biochemical, histologic and/or
behavioral), including its complications and intermediate pathological phenotypes in
development of the disorder.

Effective doses of the compositions of the present invention for the treatment of a
condition vary depending upon many different factors, including means of administration,
target site, physiological state of the subject, whether the subject is human or an animal,
other medications administered, and whether treatment is prophylactic or therapeutic.
Usually, the subject is a human, but non-human mammals, including transgenic mammals,
can also be treated.

For passive immunization with an antibody, the dosage ranges from about 0.0001
to 100 mg/kg, and more usually 0.01 to 20 mg/kg, of the host body weight. For example,
dosages can be 1 mg/kg body weight or 10 mg/kg body weight or within the range of
1-10 mg/kg, e.g., at least 1 mg/kg. Subjects can be administered such doses daily, on
alternative days, weekly or according to any other schedule determined by empirical
analysis. An exemplary treatment entails administration in multiple dosages over a
prolonged period, for example, of at least six months. Additional exemplary treatment
regimes entail administration once per every two weeks or once a month or once every 3
to 6 months. Exemplary dosage schedules include 1-10 mg/kg or 15 mg/kg on
consecutive days, 30 mg/kg on alternate days or 60 mg/kg weekly. In some methods, two
or more monoclonal antibodies with different binding specificities are administered
simultaneously, in which case the dosage of each antibody administered falls within the
ranges indicated.
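
As a minimal arithmetic sketch of the weight-based dosing described above, the following Python function converts a dosage in mg/kg into a total dose in mg and checks it against the broad range stated in the text. The function name and the example body weight are hypothetical and are given for illustration only; they are not a dosing recommendation.

```python
def antibody_dose_mg(body_weight_kg, dose_mg_per_kg, low=0.0001, high=100.0):
    """Total dose in mg for a weight-based dosage.

    low and high default to the broad range quoted in the text
    (about 0.0001 to 100 mg/kg of host body weight).
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not (low <= dose_mg_per_kg <= high):
        raise ValueError("dosage lies outside the stated mg/kg range")
    return body_weight_kg * dose_mg_per_kg


# Hypothetical 70 kg subject at 10 mg/kg.
print(antibody_dose_mg(70.0, 10.0))  # 700.0
```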

Antibody is usually administered on multiple occasions. Intervals between single
dosages can be weekly, monthly or yearly. In some methods, dosage is adjusted to
achieve a plasma antibody concentration of 1-1000 µg/ml and in some methods 25-300
µg/ml. Alternatively, antibody can be administered as a sustained release formulation, in
which case less frequent administration is required. Dosage and frequency vary
depending on the half-life of the antibody in the subject. In general, human antibodies
show the longest half-life, followed by humanized antibodies, chimeric antibodies, and
nonhuman antibodies, in descending order.

The dosage and frequency of administration can vary depending on whether the
treatment is prophylactic or therapeutic. In prophylactic applications, compositions
containing the present antibodies or a cocktail thereof are administered to a subject not
already in the disease state to enhance the subject's resistance. Such an amount is defined
to be a "prophylactic effective dose." In this use, the precise amounts again depend upon
the subject's state of health and general immunity, but generally range from 0.1 to 25 mg
per dose, especially 0.5 to 2.5 mg per dose. A relatively low dosage is administered at
relatively infrequent intervals over a long period of time. Some subjects continue to
receive treatment for the rest of their lives.

In therapeutic applications, a relatively high dosage (e.g., from about 1 to 200 mg
of antibody per dose, with dosages of from 5 to 25 mg being more commonly used) at
relatively short intervals is sometimes required until progression of the disease is reduced
or terminated, and preferably until the subject shows partial or complete amelioration of
symptoms of disease. Thereafter, the patient can be administered a prophylactic regime.




WO 2005/011376 PCT/US2004/024200

Therapeutic agents can be administered by parenteral, topical, intravenous, oral,
subcutaneous, intraarterial, intracranial, intraperitoneal, intranasal or intramuscular means
for prophylactic and/or therapeutic treatment. The most typical route of administration of
a protein drug is intravascular, subcutaneous, or intramuscular, although other routes can
be effective. In some methods, agents are injected directly into a particular tissue where
deposits have accumulated, for example intracranial injection. In some methods,
antibodies are administered as a sustained release composition or device, such as a
Medipad™ device. The protein drug can also be administered via the respiratory tract,
e.g., using a dry powder inhalation device.

Agents of the invention can optionally be administered in combination with other
agents that are at least partly effective in treatment of immune disorders.

The pharmaceutical compositions of the invention include at least one antibody of
the invention in a pharmaceutically acceptable carrier. A "pharmaceutically acceptable
carrier" refers to at least one component of a pharmaceutical preparation that is normally
used for administration of active ingredients. As such, a carrier may contain any
pharmaceutical excipient used in the art and any form of vehicle for administration. The
compositions may be, for example, injectable solutions, aqueous suspensions or solutions,
non-aqueous suspensions or solutions, solid and liquid oral formulations, salves, gels,
ointments, intradermal patches, creams, lotions, tablets, capsules, sustained release
formulations, and the like. Additional excipients may include, for example, colorants,
taste-masking agents, solubility aids, suspension agents, compressing agents, enteric
coatings, sustained release aids, and the like.

Agents of the invention are often administered as pharmaceutical compositions
including an active therapeutic agent and a variety of other pharmaceutically acceptable
components. See Remington's Pharmaceutical Science (15th ed., Mack Publishing
Company, Easton, Pennsylvania (1980)). The preferred form depends on the intended
mode of administration and therapeutic application. The compositions can also include,
depending on the formulation desired, pharmaceutically acceptable, non-toxic carriers or
diluents, which are defined as vehicles commonly used to formulate pharmaceutical
compositions for animal or human administration. The diluent is selected so as not to
affect the biological activity of the combination. Examples of such diluents are distilled
water, physiological phosphate-buffered saline, Ringer's solutions, dextrose solution, and
Hank's solution. In addition, the pharmaceutical composition or formulation may also
include other carriers, adjuvants, or nontoxic, nontherapeutic, nonimmunogenic
stabilizers and the like.

Antibodies can be administered in the form of a depot injection or implant
preparation, which can be formulated in such a manner as to permit a sustained release of
the active ingredient. An exemplary composition comprises monoclonal antibody at 5
mg/ml, formulated in aqueous buffer consisting of 50 mM L-histidine, 150 mM NaCl,
adjusted to pH 6.0 with HCl. Another example of a suitable formulation buffer for
monoclonal antibodies contains 20 mM sodium citrate, pH 6.0, 10% sucrose, 0.1%
Tween 80.

Typically, compositions are prepared as injectables, either as liquid solutions or
suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to
injection can also be prepared. The preparation also can be emulsified or encapsulated in
liposomes or microparticles such as polylactide, polyglycolide, or copolymer for
enhanced adjuvant effect, as discussed above (see Langer, Science 249:1527 (1990) and
Hanes, Advanced Drug Delivery Reviews 28:97 (1997)).

Therapies 

Treatment of a subject suffering from a disease or disorder can be monitored using
standard methods. Some methods entail determining a baseline value, for example, of an
antibody level or profile in a subject, before administering a dosage of agent, and
comparing this with a value for the profile or level after treatment. A significant increase
(i.e., greater than the typical margin of experimental error in repeat measurements of the
same sample, expressed as one standard deviation from the mean of such measurements)
in value of the level or profile signals a positive treatment outcome (i.e., that
administration of the agent has achieved a desired response). If the value for immune
response does not change significantly, or decreases, a negative treatment outcome is
indicated.
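
The baseline comparison described above can be sketched in Python as follows. This is an illustrative reading of the one-standard-deviation criterion, not a validated clinical procedure; the function name treatment_outcome and the example values are hypothetical.

```python
from statistics import stdev


def treatment_outcome(baseline, post_treatment, repeat_measurements):
    """Classify a treatment outcome from levels measured before and after dosing.

    repeat_measurements are repeat measurements of the same sample; their
    sample standard deviation is taken as the margin of experimental error.
    """
    margin = stdev(repeat_measurements)
    # An increase larger than one standard deviation counts as significant.
    if post_treatment - baseline > margin:
        return "positive"
    # No significant change, or a decrease, indicates a negative outcome.
    return "negative"


repeats = [9.0, 10.0, 11.0]                     # standard deviation is 1.0
print(treatment_outcome(10.0, 12.0, repeats))   # positive
print(treatment_outcome(10.0, 10.5, repeats))   # negative
```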

In other methods, a control value (i.e., a mean and standard deviation) of level or
profile is determined for a control population. Typically the individuals in the control
population have not received prior treatment. Measured values of the level or profile in a
subject after administering a therapeutic agent are then compared with the control value.
A significant increase relative to the control value (e.g., greater than one standard
deviation from the mean) signals a positive or sufficient treatment outcome. A lack of
significant increase or a decrease signals a negative or insufficient treatment outcome.
Administration of agent is generally continued while the level is increasing relative to the
control value. As before, attainment of a plateau relative to control values is an indicator
that the administration of treatment can be discontinued or reduced in dosage and/or
frequency.

In other methods, a control value of the level or profile (e.g., a mean and standard
deviation) is determined from a control population of individuals who have undergone
treatment with a therapeutic agent and whose levels or profiles have plateaued in response
to treatment. Measured values of levels or profiles in a subject are compared with the
control value. If the measured level in a subject is not significantly different (e.g., more
than one standard deviation) from the control value, treatment can be discontinued. If the
level in a subject is significantly below the control value, continued administration of
agent is warranted. If the level in the subject persists below the control value, then a
change in treatment may be indicated.

In other methods, a subject who is not presently receiving treatment but has
undergone a previous course of treatment is monitored for antibody levels or profiles to
determine whether a resumption of treatment is required. The measured level or profile
in the subject can be compared with a value previously achieved in the subject after a
previous course of treatment. A significant decrease relative to the previous measurement
(i.e., greater than a typical margin of error in repeat measurements of the same sample) is
an indication that treatment can be resumed. Alternatively, the value measured in a
subject can be compared with a control value (mean plus standard deviation) determined
in a population of subjects after undergoing a course of treatment. Alternatively, the
measured value in a subject can be compared with a control value in populations of
prophylactically treated subjects who remain free of symptoms of disease, or populations
of therapeutically treated subjects who show amelioration of disease characteristics. In
all of these cases, a significant decrease relative to the control level (i.e., more than a
standard deviation) is an indicator that treatment should be resumed in a subject.

The antibody profile following administration typically shows an immediate peak
in antibody concentration followed by an exponential decay. Without a further dosage,
the decay approaches pretreatment levels within a period of days to months depending on
the half-life of the antibody administered. For example, the half-life of some human
antibodies is of the order of 20 days.

In some methods, a baseline measurement of antibody to a given antigen in the
subject is made before administration, a second measurement is made soon thereafter to
determine the peak antibody level, and one or more further measurements are made at
intervals to monitor decay of antibody levels. When the level of antibody has declined to
baseline or a predetermined percentage of the peak less baseline (e.g., 50%, 25% or 10%),
a further dosage of antibody is administered. In some methods, peak or
subsequent measured levels less background are compared with reference levels
previously determined to constitute a beneficial prophylactic or therapeutic treatment
regime in other subjects. If the measured antibody level is significantly less than a
reference level (e.g., less than the mean minus one standard deviation of the reference
value in a population of subjects benefiting from treatment), administration of an additional
dosage of antibody is indicated.
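
The decay-based re-dosing criterion can be illustrated with a short Python sketch. It assumes a single exponential decay from the peak toward the baseline, which is a simplification of the profile described above; the function names and the numerical values are hypothetical.

```python
import math


def antibody_level(t_days, peak, baseline, half_life_days):
    """Level at time t, assuming the excess over baseline halves every half-life."""
    return baseline + (peak - baseline) * 0.5 ** (t_days / half_life_days)


def days_until_fraction(fraction, half_life_days):
    """Days until the peak-less-baseline excess has fallen to the given fraction."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    return half_life_days * math.log2(1.0 / fraction)


# With a 20-day half-life, the excess falls to 25% of its peak after two half-lives.
print(days_until_fraction(0.25, 20.0))          # 40.0
print(antibody_level(20.0, 100.0, 10.0, 20.0))  # 55.0
```

Under these assumptions the 50%, 25% and 10% thresholds correspond to roughly one, two and 3.3 half-lives, respectively.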

Additional methods include monitoring, over the course of treatment, any
art-recognized physiologic symptom (e.g., physical or mental symptom) routinely relied on
by researchers or physicians to diagnose or monitor disorders.

The following examples are included for purposes of illustration and should not be
construed as limiting the invention.

Exemplification

Throughout the examples, the following materials and methods were used unless 
otherwise stated. 

Materials and Methods 

In general, the practice of the present invention employs, unless otherwise 
indicated, conventional techniques of chemistry, molecular biology, recombinant DNA 
technology, immunology (especially, e.g., antibody technology), and standard techniques 
in electrophoresis. See, e.g., Sambrook, Fritsch and Maniatis, Molecular Cloning: Cold 
Spring Harbor Laboratory Press (1989); Antibody Engineering Protocols (Methods in 
Molecular Biology), 510, Paul, S., Humana Pr (1996); Antibody Engineering: A Practical 
Approach (Practical Approach Series, 169), McCafferty, Ed., Irl Pr (1996); Antibodies: A 
Laboratory Manual, Harlow et al., C.S.H.L. Press, Pub. (1999); and Current Protocols in
Molecular Biology, eds. Ausubel et al., John Wiley & Sons (1992).

Generation of Antibodies and Antigen-Binding Fragments Thereof 

The selection, cloning, and manufacture of antibodies, for example, chimeric,
humanized, monoclonal, and single-chain antibodies, are well described in the art. In
addition, the humanization of hu5c8 mAb has been described previously. See Lederman,
1992 and Karpusas, 2001, respectively. This antibody is available from the ATCC
(PTA-4931). The 5c8 antibody was stably expressed in NS0 myeloma cells and purified by
Protein A and gel filtration chromatography. SDS-PAGE and analytical gel filtration 
chromatography demonstrated that the protein formed the expected disulfide linked 
tetramer. The single-chain antibodies of the invention were typically expressed in E. coli 
and immunopurified using standard techniques. 

AQC2 scFv production 

AQC2 scFv is expressed by plasmid pKJS217. This plasmid contains 318 
nucleotides of the AQC2 light chain encoding the 106 amino acid light chain variable 
region followed in frame by 45 nucleotides encoding 3 copies of a GGGGS linker moiety. 
The linker is followed in frame by 360 nucleotides encoding the 120 amino acid AQC2 
heavy chain variable region. Immediately following the heavy chain variable region is an 
enterokinase cleavage site and myc and HIS tags. Expression was done in E. coli and is
driven by the ara-BAD promoter, and the protein is directed to the periplasmic space by an
80 nucleotide fragment encoding the gIII peptide from the bacteriophage fd. This
peptide was cleaved from the protein during periplasmic export.

5C8 Fab production

The 5C8 Fab fragment was expressed by the bicistronic plasmid pBEF064. The
first cistron contains 354 nucleotides of the 5C8 heavy chain encoding the 118 amino acid
heavy chain variable region followed in frame by 306 nucleotides encoding the first 102
amino acids of the human IgG1 constant domain and 18 nucleotides encoding a 6
histidine tag. A second ribosome entry site is located 7 nucleotides after the end of the
heavy chain cistron. The second cistron contains 333 nucleotides encoding the 111 amino
acid 5C8 light chain variable region followed in frame by 321 nucleotides encoding the
107 amino acid light chain constant domain. Expression was done in E. coli and is driven
by the ara-BAD promoter, and the heavy and light chains are directed to the periplasmic
space by the OmpA (heavy chain) and PhoA (light chain) periplasmic localization
signals. The periplasmic localization signals are cleaved from the protein during
periplasmic export.
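
As a simple consistency check on the construct descriptions above, the nucleotide counts can be compared with the stated amino acid counts at three nucleotides per codon. The following Python sketch is illustrative only; the table layout and function names are hypothetical, and the 15-residue length of the linker is inferred here from three copies of the five-residue GGGGS unit.

```python
# Segment sizes quoted in the text: (construct, segment, nucleotides, residues).
SEGMENTS = [
    ("AQC2 scFv", "light chain variable region", 318, 106),
    ("AQC2 scFv", "3 x GGGGS linker", 45, 15),
    ("AQC2 scFv", "heavy chain variable region", 360, 120),
    ("5C8 Fab", "heavy chain variable region", 354, 118),
    ("5C8 Fab", "first part of IgG1 constant domain", 306, 102),
    ("5C8 Fab", "6 x histidine tag", 18, 6),
    ("5C8 Fab", "light chain variable region", 333, 111),
    ("5C8 Fab", "light chain constant domain", 321, 107),
]


def encoded_residues(nucleotides):
    """Number of amino acids encoded by an in-frame stretch of nucleotides."""
    if nucleotides % 3 != 0:
        raise ValueError("segment length is not a whole number of codons")
    return nucleotides // 3


def inconsistent_segments(segments):
    """Return the segments whose stated residue count differs from nucleotides / 3."""
    return [s for s in segments if encoded_residues(s[2]) != s[3]]


print(inconsistent_segments(SEGMENTS))  # [] (every stated count is consistent)
```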

Binding Assays 

Binding assays were typically performed using the KinExA™ kit. The assay is
carried out by passing a dilute solution of the antibody (or antigen-binding fragment)
through the column provided in the kit, where some of the antibody (or the antigen-binding
fragment thereof) interacts with the antigen on the bead. The antibody (or the fragment)
is then detected with a secondary anti-human IgG heavy and light chain antibody
conjugated with the fluorescent dye Cy5 (Jackson ImmunoResearch Laboratories, Inc.,
West Grove, PA). The concentration of the antibody (or the fragment) is set so that the
signal from the fluorescent dye is proportional to the concentration of protein. To obtain
the solution phase affinity of the interaction, the antibody (or the fragment) is mixed with
a dilution series of soluble antigen. These proteins (antibody and antigen) are allowed to
reach equilibrium during a three-hour incubation at room temperature or an overnight
incubation at 4°C. The mixture is flowed over the antigen-containing column, and the
signal is proportional to the amount of unbound antibody (or antibody fragment) that
remains in solution. The resulting data can be plotted on a linear-log scale graph and fit
to a quadratic curve by non-linear regression, which gives a value for the K_D.
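
One way to picture the quadratic fit described above is sketched below in Python. The sketch assumes a simple 1:1 binding equilibrium, for which the free antibody concentration is a root of a quadratic equation, and it replaces a full non-linear regression routine with a plain log-spaced grid search. It does not reproduce the software supplied with the kit; the function names, concentrations, and search bounds are hypothetical.

```python
import math


def free_antibody(total_ab, total_ag, kd):
    """Equilibrium concentration of unbound antibody sites for 1:1 binding (molar)."""
    s = total_ab + total_ag + kd
    bound = (s - math.sqrt(s * s - 4.0 * total_ab * total_ag)) / 2.0
    return total_ab - bound


def fit_kd(total_ab, antigen_series, free_fractions, kd_lo=1e-12, kd_hi=1e-6, steps=2000):
    """Return the K_D on a log-spaced grid that minimises the squared error."""
    best_kd, best_sse = None, float("inf")
    for k in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (k / steps)
        sse = sum(
            (free_antibody(total_ab, ag, kd) / total_ab - f) ** 2
            for ag, f in zip(antigen_series, free_fractions)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic titration: 0.1 nM antibody sites, antigen from 10 pM to 100 nM, K_D = 1 nM.
true_kd, ab = 1e-9, 1e-10
antigen = [1e-11 * 10 ** (i / 2) for i in range(9)]
signal = [free_antibody(ab, ag, true_kd) / ab for ag in antigen]
print(fit_kd(ab, antigen, signal))
```

With noise-free synthetic data the recovered value lies close to the 1 nM used to generate it; with real data the fit would also have to account for measurement noise and non-specific signal.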

Binding assay 5C8-CD40L

An ELISA-based competitive binding assay was done. Anti c-myc mAb was
coated onto NUNC Maxisorb plates at 10 µg/mL in PBS for 2 hrs at room temperature.
Serial dilutions of unlabeled 5C8 Fab (mutants or wildtype) were made and mixed with
equal volumes of fixed concentration (30 ng/ml) of biotin-labeled 5C8 Fab competitor,
and added to the plate. After 2 hours incubation at room temperature, the plate was
washed and bound biotin-labeled 5C8 Fab competitor was detected with streptavidin-
HRP. Binding affinities were obtained from four parameter curve fits.
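
For orientation, a four parameter curve of the kind commonly used for competition ELISA data is sketched below in Python. The particular parameterization (bottom, top, ic50, hill) is an assumption, since the text does not state which form was fitted, and the numerical values are hypothetical.

```python
def four_parameter_logistic(x, bottom, top, ic50, hill):
    """Signal predicted at competitor concentration x.

    bottom, top : lower and upper plateaus of the signal
    ic50        : concentration giving a signal midway between the plateaus
    hill        : slope factor
    """
    return bottom + (top - bottom) / (1.0 + (x / ic50) ** hill)


# At x equal to ic50 the signal is midway between the plateaus.
print(four_parameter_logistic(30.0, 0.1, 2.1, 30.0, 1.0))
```

In a competition format, a mutant with a lower fitted ic50 than wildtype displaces the labeled competitor at lower concentration, which is consistent with tighter binding.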

Computer Modeling Metrics and Formulae 

For carrying out the optimization of an antibody according to the invention, the
following metrics and formulae can be used. For example, the binding free energy, i.e., the
difference between the electrostatic free energy in the bound and the unbound states of the
antibody, can be represented as $\Delta G_{\mathrm{binding}} = G^{\mathrm{bound}} - G^{\mathrm{unbound}}$ (see FIG. 1a). Because the
dielectric model includes responses that affect the entropy as well as the enthalpy, the
electrostatic energy is considered to be a free energy. The free energy of each state is
expressed as a sum of coulombic and reaction-field (hydration) terms involving the
antibody (or antigen-binding fragment thereof) (L), the antigen (R), and their interaction
(L-R):

$$G^{\mathrm{state}} = G^{\mathrm{state}}_{\mathrm{coul},L} + G^{\mathrm{state}}_{\mathrm{coul},R} + G^{\mathrm{state}}_{\mathrm{coul},L\text{-}R} + G^{\mathrm{state}}_{\mathrm{hyd},L} + G^{\mathrm{state}}_{\mathrm{hyd},R} + G^{\mathrm{state}}_{\mathrm{hyd},L\text{-}R} \tag{1}$$

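This results in the following expression for the binding free energy,

$$\Delta G_{\mathrm{binding}} = \Delta G_{\mathrm{coul},L\text{-}R} + \Delta G_{\mathrm{hyd},L\text{-}R} + \Delta G_{\mathrm{hyd},L} + \Delta G_{\mathrm{hyd},R} \tag{2}$$
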
where the fact that the geometry of point charges in the antigen and antibody remains fixed
is used in the model to cancel the coulombic self contributions of antibody and antigen, and
where the two L-R terms are due only to the bound state because the antibody and antigen
are assumed not to interact in the unbound state. (Note, however, that the charge
distribution for the antigen need not be the same in the bound and unbound states. If they
are different, this adds a constant to $\Delta G_{\mathrm{binding}}$ that can be dropped in defining $\Delta G_{\mathrm{var}}$ in Eq.
(3).) Thus, Eq. (2) describes the electrostatic binding free energy as a sum of desolvation
contributions of the antibody and the antigen (which are unfavorable) and solvent
screened electrostatic interaction in the bound state (which is usually favorable). Since
the goal is to vary the antibody charge distribution to optimize the electrostatic binding
free energy and the last term simply adds a constant, a relevant variational binding energy
is defined,

$$\Delta G_{\mathrm{var}} = \Delta G_{\mathrm{int},L\text{-}R} + \Delta G_{\mathrm{hyd},L} \tag{3}$$

in which the first two terms on the right hand side (RHS) of Eq. (2) have been combined
into a screened interaction term and the constant term has been dropped. Note that

$$\Delta G_{\mathrm{int},L\text{-}R} = \sum_{j \in R} q_j V_{L}^{\mathrm{bound}}(\vec{r}_j) = \sum_{j \in R} q_j \left[ V_{\mathrm{coul},L}^{\mathrm{bound}}(\vec{r}_j) + V_{\mathrm{hyd},L}^{\mathrm{bound}}(\vec{r}_j) \right] \tag{4}$$

and

$$\Delta G_{\mathrm{hyd},L} = \frac{1}{2} \sum_{i \in L} q_i \left[ V_{\mathrm{hyd},L}^{\mathrm{bound}}(\vec{r}_i) - V_{\mathrm{hyd},L}^{\mathrm{unbound}}(\vec{r}_i) \right] \tag{5}$$

where $V_{L}^{\mathrm{state}}$ is the total electrostatic potential in the indicated state due to the
antibody charge distribution only and $V_{\mathrm{term},L}^{\mathrm{state}}$ is the coulombic or reaction-field
(hydration) term, as indicated. The summations are over atomic point charges in the
antibody ($i \in L$) or antigen ($j \in R$). The factor of 1/2 in Eq. (5) is due to the fact that the
antibody charge distribution interacts with the self-induced reaction field. $V_{\mathrm{coul},L}^{\mathrm{bound}}$,
$V_{\mathrm{hyd},L}^{\mathrm{bound}}$, and $V_{\mathrm{hyd},L}^{\mathrm{unbound}}$, the three electrostatic potentials in Eqs. (4) and (5), are
expressed in terms of the given geometry and charge distribution by solving the
boundary-value problem shown in FIG. 1b. A charge distribution (corresponding to the
antibody) is embedded in a sphere of radius R. The center of the sphere is taken as the
origin of coordinates (unprimed) but the charge distribution in multipoles is expanded
about a second origin (primed) translated a distance d along the z-axis, so that

$$\vec{r}(r, \theta, \phi) = \vec{r}\,'(r', \theta', \phi') + \vec{d}(d, \theta_d = 0, \phi_d) \tag{6}$$


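The potential everywhere satisfies the Poisson equation. Inside the sphere, it may
be written as,

$$V_{\mathrm{in}}(\vec{r}) = \sum_{i \in L} \frac{q_i}{\epsilon_1 |\vec{r} - \vec{r}_i|} + \sum_{l=0}^{\infty} \sum_{m=-l}^{l} A_{l,m}\, r^{l}\, Y_{l,m}(\theta, \phi) \tag{7}$$

where the first term on the RHS is the coulombic and the second is the reaction-field
(hydration) potential, and the summation over i corresponds to the antibody point charges.
Outside the sphere, the coulombic and reaction-field potential can be combined and
written as,

$$V_{\mathrm{out}}(\vec{r}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \frac{B_{l,m}}{r^{l+1}}\, Y_{l,m}(\theta, \phi) \tag{8}$$
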

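where $A_{l,m}$ and $B_{l,m}$ are to be determined by the proper boundary conditions and $Y_{l,m}(\theta, \phi)$
are the spherical harmonics. The coulombic term in Eq. (7) is expanded in spherical
harmonics and multipoles of the charge distribution about the center of the sphere. Here
the origin of the multipole expansion is shifted to $\vec{d}$,

$$\sum_{i \in L} \frac{q_i}{\epsilon_1 |\vec{r} - \vec{r}_i|} = \sum_{i \in L} \frac{q_i}{\epsilon_1 |\vec{r}\,' - \vec{r}\,'_i|} \tag{9}$$

$$= \sum_{l'=0}^{\infty} \sum_{m'=-l'}^{l'} \frac{4\pi}{\epsilon_1 (2l'+1)}\, Q'^{*}_{l',m'}\, \frac{Y_{l',m'}(\theta', \phi')}{r'^{\,l'+1}} \tag{10}$$

where $Q'_{l,m}$ is a spherical multipole expanded about the primed origin, $\vec{d}$,

$$Q'^{*}_{l,m} \equiv \sum_{i \in L} q_i\, r_i'^{\,l}\, Y^{*}_{l,m}(\theta'_i, \phi'_i) \tag{11}$$
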
The definition of the $Y_{l,m}(\theta, \phi)$ used by Jackson is adopted (J. D. Jackson, Classical
Electrodynamics, 2nd ed. (John Wiley and Sons, New York, 1975)).

The expression in Eq. (10) is valid for $r' > r'_i$ (i.e., outside the antibody or, more
precisely, outside the sphere whose center is at $\vec{d}$ and whose radius is the longest
distance between $\vec{d}$ and a point charge). To substitute into Eq. (7) and combine terms
involving spherical harmonics, first $Y_{l',m'}(\theta', \phi')/r'^{\,l'+1}$ of Eq. (10) was expanded in
terms of $Y_{l,m}(\theta, \phi)/r^{l+1}$. This was done using the
results of Greengard (The Rapid Evaluation of Potential Fields in Particle Systems, MIT
Press, Cambridge, Mass., 1988) which state that for r>d,



[Equation (12)]

where

[Equation (13)]



Since a geometry with $\theta_d = 0$ has been used, only m'=0 terms in Eq. (12) are
non-vanishing, in which case Eq. (10) becomes,

[Equation (14)]

in which the multipole distribution was taken about the point $\vec{d}$, but the potential was
expressed as a summation of spherical harmonics about the large-sphere center. The
above equation can also be written as,

[Equation (15)]

where terms with the same $Y_{l,m}(\theta, \phi)$ are grouped together, as opposed to Eq. (14), where
terms with the same $Q'^{*}_{l',m'}$ are grouped.

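Upon substituting Eq. (15) into Eq. (7) and matching boundary conditions at r=R,

$$V_{\mathrm{in}} \big|_{r=R} = V_{\mathrm{out}} \big|_{r=R} \tag{16}$$

$$\epsilon_1 \frac{\partial V_{\mathrm{in}}}{\partial r} \bigg|_{r=R} = \epsilon_2 \frac{\partial V_{\mathrm{out}}}{\partial r} \bigg|_{r=R} \tag{17}$$
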

the hydration (reaction-field) potential inside the sphere is, 



[Equations (18) and (19)]

where

[Equation (20)]

The various V's can be rewritten, with their dependence on the $Q'^{*}_{l,m}$ made
explicit. $V_{\mathrm{coul},L}^{\mathrm{bound}}$ is given by Eq. (20), $V_{\mathrm{hyd},L}^{\mathrm{bound}}$ is given by Eq. (19) but rewritten so
that the terms with the same $Q'^{*}_{l,m}$ are collected, and $V_{\mathrm{hyd},L}^{\mathrm{unbound}}$ is given by Eq. (19) with R=a
and d=0.

[Equations (21) through (23)]

Substituting into Eq. (4), the dependence of $\Delta G_{\mathrm{int},L\text{-}R}$ on the
$Q'^{*}_{l,m}$ is made explicit,

$$\Delta G_{\mathrm{int},L\text{-}R} = \sum_{j \in R} q_j \left[ V_{\mathrm{coul},L}^{\mathrm{bound}}(\vec{r}_j) + V_{\mathrm{hyd},L}^{\mathrm{bound}}(\vec{r}_j) \right] \tag{24}$$

[Equation (25)]

$$= \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \alpha_{l,m}\, Q'^{*}_{l,m} \tag{26}$$

where in the last line the element $\alpha_{l,m}$ is defined, which is independent of the $Q'^{*}_{l,m}$, to be
the factor multiplying $Q'^{*}_{l,m}$ in Eq. (25). Each $\alpha_{l,m}$ expresses the contribution of a
multipole to $\Delta G_{\mathrm{int},L\text{-}R}$ and contains all information concerning the antigen charge
distribution required to obtain $\Delta G_{\mathrm{var}}$. For $\Delta G_{\mathrm{hyd},L}$ it is useful to re-express Eq. (5) in terms
of the $Q'^{*}_{l,m}$, the multipoles describing the antibody charge distribution, rather than the
individual charges, $q_i$.

$V(\vec{r})$ is expanded around the center of the multipole expansion, $\vec{d}$,

[Equations (27) and (28)]

It has been shown by Rose (M. E. Rose, J. Math. & Phys. 37, 215 (1958); M. E. Rose, 
Elementary Theory of Angular Momentum (John Wiley and Sons, New York, 1957)) that 
in spherical coordinates the expansion becomes, 



[Equation (29)]

where

[Equation (30)]

and $\mathcal{Y}_{l,m}(\nabla)$ is the operator obtained by replacing $\vec{r}$ with $\nabla$.

For positive m and when $\mathcal{Y}_{l,m}(\nabla)$ operates on a solution of the Laplace equation (i.e.,
$r^{l} Y_{l,m}(\theta, \phi)$ or $Y_{l,m}(\theta, \phi)/r^{l+1}$), it has been shown that,

[Equation (31)]

for $m \geq 0$.

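The double factorial is defined as

$$(2l+1)!! = (2l+1)(2l-1)(2l-3) \cdots 3 \cdot 1 \tag{32}$$

$$= \frac{(2l+1)!}{2^{l}\, l!} \tag{33}$$

and the spherical partial derivatives are

$$\nabla_{1} = -\frac{1}{\sqrt{2}} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right), \quad \nabla_{0} = \frac{\partial}{\partial z}, \quad \nabla_{-1} = \frac{1}{\sqrt{2}} \left( \frac{\partial}{\partial x} - i \frac{\partial}{\partial y} \right) \tag{34}$$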
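
The equality between the product form of the double factorial in Eq. (32) and the closed form in Eq. (33) can be checked numerically. The short Python sketch below is illustrative only; the function name is hypothetical.

```python
from math import factorial


def odd_double_factorial(l):
    """(2l+1)!! computed as the product (2l+1)(2l-1)...3*1, as in Eq. (32)."""
    result = 1
    for k in range(2 * l + 1, 0, -2):
        result *= k
    return result


# Compare with the closed form (2l+1)! / (2^l * l!) of Eq. (33).
for l in range(6):
    closed_form = factorial(2 * l + 1) // (2 ** l * factorial(l))
    print(l, odd_double_factorial(l), closed_form)
```

For l = 2, for example, both forms give 5 x 3 x 1 = 15.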



To compute $\mathcal{Y}_{l,m}(\nabla)$ for negative m, the fact that $Y_{l,-m}(\theta, \phi) = (-1)^{m}\, Y^{*}_{l,m}(\theta, \phi)$ is
used, together with the definitions of spherical partial derivatives in Eq. (34), to obtain,

[Equation (35)]

for $m \leq 0$.



The hydration energy of the bound antibody is then 

[Equation (37)]



To evaluate $\mathcal{Y}_{l,m}(\nabla)$ in Eq. (37), Eq. (31) and the gradient formula are used (M. E. Rose,
Elementary Theory of Angular Momentum (John Wiley and Sons, New York, 1957))

[Equation (38)]

where

[Equation (39)]

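the $C(l', 1, l;\, m-m', m')$ are the vector addition (or Clebsch-Gordan) coefficients frequently
encountered in the study of angular momentum shown in Table 1 (of Rose), and
the $\hat{\xi}_{m'}$ are spherical unit vectors,

$$\hat{\xi}_{1} = -\frac{1}{\sqrt{2}} (\hat{x} + i \hat{y}), \quad \hat{\xi}_{-1} = \frac{1}{\sqrt{2}} (\hat{x} - i \hat{y}), \quad \hat{\xi}_{0} = \hat{z} \tag{40}$$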

Accordingly,

[Equation (41)]

From Eqs. (38) through (41),

[Equation (42)]

Using Table I, Eq. (31), and Eqs. (37) through (42), the following intermediate results are
obtained,

[Equations (44) and (45)]

and the final expression for the hydration energy of the antibody in the bound state,

[Equations (46) and (47)]

where $\beta_{l,m,l',m'}$ is defined by the above two equations; note that $\beta_{l,m,l',m'}$ is zero for $m' \neq m$.



The hydration energy of the unbound antibody is obtained by setting d=0 and R=a
in Eq. (46),

[Equations (48) and (49)]

where $\gamma_{l,m}$ is defined by Eqs. (48) and (49). Then, $\gamma_{l,m}$ is written as a function of both $l$ and
$m$ for notational convenience, although there is no formal dependence on $m$.

Thus AGvor has been expressed as a function of the multipoles of the antibody 
charge distribution, Q'/,m (expanded about the center of the antibody sphere) and the 
elements a/, m , famjw and yi ttn which do not depend on Q V Combining Eqs. (26), (47) 
and (49) gives 



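$$\Delta G_{var} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \alpha_{l,m}\,Q'^{*}_{l,m} + \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} \beta_{l,m,l',m'}\,Q'^{*}_{l,m}\,Q'_{l',m'} - \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \gamma_{l,m}\,Q'^{*}_{l,m}\,Q'_{l,m} \qquad (50)$$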



Note that only the $\alpha_{l,m}$ depend on the antigen charges, while the $\beta_{l,m,l',m'}$ and $\gamma_{l,m}$
depend solely on the geometry of the bound and unbound states. While $\Delta G_{var}^{opt}$ is a real
quantity, the $\alpha_{l,m}$ and $Q'_{l,m}$ are complex and the products $\alpha_{l,m}Q'^{*}_{l,m}$ and $Q'^{*}_{l,m}Q'_{l',m'}$ involve
summations over terms of the form $Y^{*}_{l',m'}(\theta',\phi')\,Y_{l,m}(\theta,\phi)$; note that the $\beta_{l,m,l',m'}$ and $\gamma_{l,m}$ are
real. Then $\Delta G_{var}^{opt}$ is rewritten in terms of the real and imaginary parts of $\alpha_{l,m}$ and $Q'_{l,m}$,

(51) 



<*> r / 



CO OO r 



A, o/,o 



(-0 ^-0 



^2 A^/. ra (Reeu Re e;> + i™e;,,„ime; v ) 



CO 



(where summations over $m$ are excluded for $l=0$) by noting again that
$Y_{l,-m}(\theta,\phi) = (-1)^{m}\,Y^{*}_{l,m}(\theta,\phi)$ and






WO 2005/011376 



PCT/US2004/024200 



^r^AnY*! JOS (52) 

^ReY^e^.ReY,^^^^ (53) 

The new variables $\mathrm{Re}\,Q'_{l,m}$ and $\mathrm{Im}\,Q'_{l,m}$ are re-indexed and
renamed $Q_i$ as follows,

RcQ\ 2s . . . ^{Qi^Qa^Qs.Qa.Q^Qs, ■ • • }- (54) 

and similar transformations are used to create $\alpha_i$, $\beta_{ij}$ and $\gamma_i$. Eq. (51) can then be written
as



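$$\Delta G_{var} = \sum_{i=1}^{\infty} \alpha_i Q_i + \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} \beta_{ij}\,Q_i Q_j - \sum_{i=1}^{\infty} \gamma_i\,Q_i^{2} \qquad (55)$$

$$\phantom{\Delta G_{var}} = \sum_{i=1}^{\infty} \alpha_i Q_i + \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} (\beta_{ij} - \gamma_i\,\delta_{ij})\,Q_i Q_j \qquad (56)$$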
or in matrix notation,

$$\Delta G_{var} = \vec{Q}^{T}\mathbf{B}\,\vec{Q} + \vec{Q}^{T}\vec{A} \qquad (57)$$

$$\phantom{\Delta G_{var}} = \left(\vec{Q} + \tfrac{1}{2}\mathbf{B}^{-1}\vec{A}\right)^{T}\mathbf{B}\left(\vec{Q} + \tfrac{1}{2}\mathbf{B}^{-1}\vec{A}\right) - \tfrac{1}{4}\,\vec{A}^{T}\mathbf{B}^{-1}\vec{A} \qquad (58)$$

where $\vec{Q}$ is the vector formed by the $Q_i$, $\vec{A}$ is the vector formed by the $\alpha_i$,
$\mathbf{B}$ is the symmetric matrix formed by the $(\beta_{ij} - \delta_{ij}\gamma_i)$, and completion of the square has
been used to arrive at Eq. (58). Since $\vec{Q}^{T}\mathbf{B}\,\vec{Q}$ in Eq. (57) corresponds to the antibody
desolvation penalty, which must be greater than zero for chemically reasonable geometries,
the matrix $\mathbf{B}$ is positive definite and the extremum of $\Delta G_{var}$ is a minimum (G. Strang,
Introduction to Applied Mathematics (Wellesley-Cambridge Press, Wellesley, Mass., 1986)).

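From Eq. (58) the optimum values of the multipoles, $\vec{Q}^{opt}$, and the minimum
variational binding energy, $\Delta G_{var}^{opt}$, are obtained,

$$\vec{Q}^{opt} = -\tfrac{1}{2}\,\mathbf{B}^{-1}\vec{A} \qquad (59)$$

$$\Delta G_{var}^{opt} = -\tfrac{1}{4}\,\vec{A}^{T}\mathbf{B}^{-1}\vec{A} \qquad (60)$$

$\Delta G_{var}^{opt}$ is always negative because $\mathbf{B}^{-1}$ is also positive definite.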
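The optimum of Eqs. (59) and (60) can be illustrated numerically. The following is a minimal sketch, assuming a small, hypothetical symmetric positive definite matrix B and vector A chosen only for illustration (they are not values from this description); the helper names `solve`, `delta_g` and `optimum` are likewise illustrative.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def delta_g(B, A, Q):
    """Evaluate the quadratic form Q^T B Q + Q^T A (the form of Eq. 57)."""
    n = len(Q)
    quad = sum(Q[i] * B[i][j] * Q[j] for i in range(n) for j in range(n))
    return quad + sum(Q[i] * A[i] for i in range(n))


def optimum(B, A):
    """Return (Q_opt, dG_opt) with Q_opt = -1/2 B^-1 A and dG_opt = -1/4 A^T B^-1 A."""
    binv_a = solve(B, A)
    q_opt = [-0.5 * v for v in binv_a]
    dg_opt = -0.25 * sum(a * v for a, v in zip(A, binv_a))
    return q_opt, dg_opt


# Hypothetical toy inputs (assumptions, not data from the source).
B = [[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]]
A = [-1.0, 0.4, -0.6]
q_opt, dg_opt = optimum(B, A)
```

Evaluating `delta_g(B, A, q_opt)` reproduces `dg_opt`, and any small displacement away from `q_opt` raises the energy, as expected for a positive definite B.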

To solve for the optimal multipole distribution with the monopole (total charge, $Q_1$)
fixed, the equation for the remaining optimal multipoles ($i \neq 1$) is,

(61)



which is analogous to Eq. (59). 

The above matrix equations, with the dimension truncated at $i_{max} = (l_{max}+1)^{2}$, can be
solved numerically with relatively modest computational resources. In practice, since the $\alpha_i$
and $\beta_{ij}$ contain a summation over an infinite number of terms, a second cutoff value of $l$
must be used to truncate the innermost sum in Eqs. (25) and (46). When $l_{max}$ and this second
cutoff are sufficiently large, $\Delta G_{var}^{opt}$ converges and the incremental advantage of including
more multipoles essentially vanishes.

For any given antigen and geometry, the present description has thus described a
method to determine the charge distribution of the tightest binding antibody as a set of
multipoles. The deviation of the binding free energy from the optimum for any test
antibody can be calculated by subtracting Eq. (60) from Eq. (58) and using Eq. (59) to
eliminate $\vec{A}$,
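Carrying out the subtraction described above gives, under the assumption that the matrix is symmetric, a deviation of the form (Q - Q_opt)^T B (Q - Q_opt). The sketch below checks this identity for a hypothetical 2x2 case; the function names and all numerical inputs are illustrative assumptions, not values from this description.

```python
def delta_g(B, A, Q):
    """Evaluate Q^T B Q + Q^T A for a 2-component multipole vector."""
    quad = sum(Q[i] * B[i][j] * Q[j] for i in range(2) for j in range(2))
    return quad + sum(Q[i] * A[i] for i in range(2))


def optimum_2x2(B, A):
    """Closed-form optimum for a 2x2 symmetric positive definite B."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    # B^-1 A written out with the explicit 2x2 inverse.
    binv_a = [(B[1][1] * A[0] - B[0][1] * A[1]) / det,
              (B[0][0] * A[1] - B[1][0] * A[0]) / det]
    q_opt = [-0.5 * v for v in binv_a]
    dg_opt = -0.25 * (A[0] * binv_a[0] + A[1] * binv_a[1])
    return q_opt, dg_opt


def deviation_from_optimum(B, A, Q):
    """Return (Q - Q_opt)^T B (Q - Q_opt), the energy lost relative to the optimum."""
    q_opt, _ = optimum_2x2(B, A)
    d = [Q[0] - q_opt[0], Q[1] - q_opt[1]]
    return sum(d[i] * B[i][j] * d[j] for i in range(2) for j in range(2))
```

Because B is positive definite, the deviation is never negative and vanishes only when the test multipoles coincide with the optimal ones.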



Table 1 - Vector Addition Coefficients 



rrt*«0 



at - -1 



\ ~ v + i 



— wrir~J [ 0 )! l — — J 



i « r - 1 



[ 2I'<2T + 1) J [ l r (2)'4l> J [ 21^ + 3) 
EXAMPLE 1

METHODS OF IMPROVING THE ANTIGEN-BINDING AFFINITY
OF AN ANTI-INTEGRIN ANTIBODY

In this example, methods for improving the binding affinity of an antibody against
a therapeutically relevant antigen target are described.

As proof-of-principle, the method of the invention was applied to an antibody
against VLA-1 integrin, a cell-surface receptor for collagen and laminin, and in particular,
the monoclonal antibody AQC2, which was raised against the human VLA-1 receptor by
affinity maturation in mice. AQC2 inhibits the pathological processes mediated by VLA-1
integrin (see, e.g., WO 02/083854).

A variant of AQC2 with two mutations binds to VLA-1 with 100-fold less affinity
than the wild-type antibody. In an effort to restore this binding, electrostatic charge
optimization techniques were applied to a crystal structure of the antibody-antigen
complex in a two-level procedure to suggest improved-affinity mutants. First,
electrostatic charge optimization was used to determine the position(s) of the CDR
residue(s) that are sub-optimal for binding (Lee and Tidor, J. Chem. Phys. 106:8681-8690,
1997; Kangas and Tidor, J. Chem. Phys. 109:7522-7545, 1998). Second, a set of
CDR mutations was then determined for further computational analysis. Based on these
calculations, the binding affinity was determined for 36 modified antibodies having a
single mutation (i.e., 36 "single mutants") and 10 antibodies having two mutations (i.e.,
ten "double mutants"). It was predicted that 26 of the single mutants would be
electrostatically favorable relative to the wild-type antibody, and that 15 would bind
better with a full energy function including a van der Waals energy term and a solvent
accessible surface area term. These terms are unrelated to electrostatic forces, but they
were calculated to ensure that the designed mutations did not contact other residues and
would not reduce the amount of buried surface area significantly; increased buried surface
area in complex formation is usually beneficial (see the "Full Energy" column of the table
below). Additionally, it was predicted that many of the double mutants would be more
favorable than the wild-type complex and that the effects would be partially additive with
respect to the single mutants.

The mutation predictions can be categorized as involving (1) mutations at the
interaction interface involving residues that become partially buried upon binding
(interactions are improved by making hydrogen bonds with the antibody); (2) mutations
of polar residues on the antibody that become buried upon binding and thus pay a
desolvation penalty but do not make any direct electrostatic interactions with the antibody
(improvements are usually made by mutation to a hydrophobic residue with similar shape
to the wild-type residue or by adding a residue that can make favorable electrostatic
interactions); and (3) mutations of surface residues on the antibody that are in regions of
uncomplementary potentials. These mutations are believed to improve long-range
electrostatic interactions between the antibody and antigen without perturbing packing
interactions at the binding interface.

Based on results from a charge optimization, mutations were determined for 
computational analysis (the optimal charge distributions and design mutations that were 
closer to optimal than the current residue were examined; this process was done by 
inspection). A charge optimization gave charges at atom centers but did not yield actual 
mutation(s). A round of charge optimizations was performed with various constraints 
imposed to represent natural side chain characteristics. For example, an optimization was 
performed for a net side chain charge of -1, 0, and +1 with the additional constraint that 
no atom's charge exceeded an absolute value of 0.85 electron charge units. 

The crystal structure of the VLA-1/AQC2 complex (PDB code: 1MHP) was 
prepared using standard procedures for adding hydrogens with the program CHARMM 
(Accelrys, Inc., San Diego, CA). N-acetamide and N-methylamide patches were applied 
to the N termini and C-termini, respectively. There was missing density for residues 288- 
293 in one of the complexes (Model 1), but no attempt was made to rebuild the density. 
Using a continuum electrostatics model, an electrostatic charge optimization was 
performed on each side chain of the amino acids in the CDRs of the AQC2 antibody.
Appropriate side chain mutations were then determined based on the potential gain in 
electrostatic binding energy observed in the optimizations. Side chains were built by 
performing a rotamer dihedral scan in CHARMM, using dihedral angle increments of 60 
degrees, to determine the most desirable position for each side chain. Binding energies 
were then calculated for the wild type and mutant complexes using the Poisson- 
Boltzmann electrostatic energy and additional terms for the van der Waals energy and 
buried surface area. 
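The rotamer dihedral scan described above can be pictured as an exhaustive search over a grid of side chain dihedral angles. The sketch below is only an illustration of that idea in plain Python: it does not use or reproduce CHARMM, the function names are hypothetical, and `energy_fn` is a stand-in for whatever energy evaluation is actually applied to each trial conformation.

```python
from itertools import product


def dihedral_grid(n_chi, step_deg=60):
    """All combinations of n_chi dihedral angles sampled every step_deg degrees."""
    angles = range(-180, 180, step_deg)
    return list(product(angles, repeat=n_chi))


def best_rotamer(n_chi, energy_fn, step_deg=60):
    """Return the grid point with the lowest value of the supplied energy function."""
    return min(dihedral_grid(n_chi, step_deg), key=energy_fn)


def toy_energy(chis):
    """Hypothetical placeholder energy with its minimum at 60 degrees for every angle."""
    return sum((c - 60) ** 2 for c in chis)
```

With 60 degree increments each dihedral takes six values, so a side chain with two rotatable dihedrals yields 36 trial conformations and one with three yields 216.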

The crystal structure of the α1 integrin I-domain (VLA-1) complexed with the
Fab fragment of a humanized neutralizing antibody (AQC2) was solved to 2.8 Å at a pH
of 7.40. There were two complexes within the asymmetric unit cell. A manganese (MN)
atom was at the complex interface in both complexes, with most of its interactions
coming from the I-domain. Asp101 from the antibody mimics a collagen glutamate
interaction.

The following table shows the optimization results obtained for CDR variable
loop 2 in the heavy chain of AQC2. The Mut (Mutation energy) column corresponds to
the binding free energy difference (in kcal/mol) in going from the native residue to a
completely uncharged sidechain isostere, i.e., a residue with the same shape but no
charges or partial charges on the atoms. Negative numbers indicate a predicted increase
of binding affinity. The Opt-1 column corresponds to the binding free energy difference
that can be obtained with an optimal charge distribution in the side chain and a net side
chain charge of -1. The columns Opt0 and Opt1 correspond to the binding free energy
differences with optimal charges, the net charge being 0 and +1, respectively. Based on
these results and the visual inspection of the structure, mutations are designed that can
take advantage of these binding free energy improvements. For instance, the mutation
from THR50 to VAL, which is an uncharged isostere, makes use of the predicted -0.52
kcal/mol in the mutation energy. The mutation LYS64 to GLU uses the -1.42 kcal/mol
predicted maximal free energy gain for a mutation to a side chain with a net charge of -1.

The selected mutant designs were further explored computationally according
to the following rules.

For example, in those instances in which mutation energy (Mut, corresponding to
the binding free energy difference (in kcal/mol) associated with a transition from the
native residue to a completely uncharged side chain isostere, i.e., a residue with the same
shape but no charges or partial charges on the atoms) was modeled to be favorable (e.g.,
ΔG < -0.25 kcal/mol), mutations from the set of amino acids with nonpolar sidechains,
e.g., Ala, Cys, Ile, Leu, Met, Phe, Pro, Val were selected.

Where Opt-1 energy (corresponding to the binding free energy difference that can
be obtained with an optimal charge distribution in the side chain and a net side chain
charge of -1) was favorable (e.g., ΔG < -0.25 kcal/mol), mutations from the set of amino
acids with negatively charged side chains, e.g., Asp, Glu were selected.

Similarly, where Opt+1 energy (corresponding to the binding free energy
difference that can be obtained with an optimal charge distribution in the side chain and a
net side chain charge of +1) was favorable (e.g., ΔG < -0.25 kcal/mol), mutations from the
set of amino acids with positively charged sidechains, e.g., Arg, His, Lys were selected.

Finally, in those cases in which Opt0 energy (corresponding to the binding free
energy difference that can be obtained with an optimal charge distribution in the side
chain and a net side chain charge of 0) was favorable (e.g., ΔG < -0.25 kcal/mol),
mutations from the set of amino acids with uncharged polar sidechains, e.g., Asn, Cys,
Gln, Gly, His, Met, Phe, Ser, Thr, Trp, Tyr, to which are added Cys, Gly, Met and Phe
were selected.
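The four selection rules above amount to a threshold test on each energy column. A minimal sketch follows, assuming the example cutoff of -0.25 kcal/mol and the residue sets listed in the rules; the dictionary layout, the column key "Opt+1" and the function name `suggest` are illustrative choices, and the two example rows are the THR50 and LYS64 entries of Table 2.

```python
THRESHOLD = -0.25  # kcal/mol, the example cutoff quoted in the rules above

# Candidate replacement residues for each energy column, as listed in the text.
CANDIDATES = {
    "Mut": ["Ala", "Cys", "Ile", "Leu", "Met", "Phe", "Pro", "Val"],
    "Opt-1": ["Asp", "Glu"],
    "Opt+1": ["Arg", "His", "Lys"],
    "Opt0": ["Asn", "Cys", "Gln", "Gly", "His", "Met", "Phe", "Ser", "Thr", "Trp", "Tyr"],
}


def suggest(energies, threshold=THRESHOLD):
    """Map each favorable column (energy below threshold) to its candidate residues.

    Columns with no value (None) are skipped.
    """
    return {
        column: CANDIDATES[column]
        for column, energy in energies.items()
        if energy is not None and energy < threshold
    }


# Rows taken from Table 2 (heavy chain CDR loop 2 of AQC2).
thr50 = {"Mut": -0.52, "Opt-1": 0.3, "Opt0": -1.24, "Opt+1": 3.17}
lys64 = {"Mut": -0.55, "Opt-1": -1.42, "Opt0": -1.23, "Opt+1": -0.97}
```

For THR50 the rule flags the Mut and Opt0 columns, which is consistent with the Thr50Val design discussed above; for LYS64 all four columns fall below the cutoff.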
Table 2 - Optimization results obtained for AQC2 CDR heavy chain variable loop 2

Number   Residue     Mut    Opt-1     Opt0     Opt1
50       THR       -0.52     0.3     -1.24     3.17
51       ILE        0       -1.05    -0.91    -0.56
52       SER        0.39     6.33    -0.09     1.77
53       GLY        --       --       --       --
54       GLY        --       --       --       --
55       GLY        --       --       --       --
56       HSD       -0.2     -0.09    -0.68    -0.02
57       THR        0.05    -0.77    -0.61    -0.3
58       TYR       -0.13    -1.98    -1.37     3.06
59       TYR        0.03    -1.35    -0.91    -0.39
60       LEU        0       -1.39    -1.08    -0.71
61       ASP        0.56    -0.11     0.25     0.64
62       SER        0.01    -0.21    -0.08     0.08
63       VAL        0       -0.98    -0.73    -0.36
64       LYS       -0.55    -1.42    -1.23    -0.97


As described before, the designed mutants are built in silico and the binding 
energy is recalculated. Results from these computational mutation calculations are shown 
below. Numbers represent change in binding affinity from wild-type to the mutant 
(negative meaning mutant is more favorable). Energies are the average of the two 
models. 
Table 3 - Computational mutation calculations for AQC2 CDRs

Heavy Chain Modifications

Mutation     Electrostatics   Full Energy   Type
Asp106Asn        -0.1            -0.1         3
Arg31Gln         -2.2             2.3         1
Arg31Glu         -0.8             4.9         1
Arg31Lys          0.5             2.7         1
Arg31Phe          0.9             2.8         2
Tyr32Phe         -0.4             0.6         2
Ser35Asn         -1.3            -1.3         2
Ser35Gln         -0.6            -0.7         2
Thr50Val         -1.2            -1.7         2
His56Phe         -0.8            -0.8         2
Tyr58Asn         -0.3             5.1         3
Tyr58Asp         -2.0             3.2         3
Tyr58Gln         -2.4             2.1         3
Tyr58Glu         -1.2             3.1         3
Tyr59Asp         -0.6            -0.6         3
Tyr59Glu         -0.5            -0.5         3
Leu60Asp         -0.1            -0.1         3
Leu60Glu         -0.3            -0.3         3
Lys64Asn         -0.6            -0.5         3
Lys64Asp         -0.9            -0.8         3
Lys64Gln         -0.6            -0.5         3
Lys64Glu         -0.9            -0.9         3
Computational mutation calculations for AQC2 CDRs (continued)

Light Chain Modifications

Mutation     Electrostatics   Full Energy   Type
Asn30Ala         -0.1             1.1         2
Asn30Ile          0.5             0.2         2
Asn30Leu         -0.3            -0.5         2
Asn30Val         -0.5            -0.2         2
His31Arg          1.6             1.9         1
His31Lys         -0.7             1.3         1
Leu49Arg          1.0             0.0         1
Leu49His          2.4             0.6
Leu49Lys         -0.1            -1.1
Asn52Arg          0.1             0.1
Asn52His          2.8            -0.2
Asn52Lys          0.3             1.5
Trp95Asp          2.5             4.4         3
Trp95Glu          0.7             2.9         3



As the results show, the computational process described above was successfully
implemented to predict affinity enhancing side chain mutations. These findings were
classified into three general classes of mutations. The first type of mutation involves
residues at the interface across from a charged group on the antigen capable of making a
hydrogen bond; the second involves buried polar residues that pay a desolvation penalty
upon binding but do not make back electrostatic interactions; and the third involves
long-range electrostatic interactions.

The first type of mutation is determined by inspection of basic physical/chemical
considerations, as these residues essentially make hydrogen bonds with unsatisfied
hydrogen partners of the antigen. Surprisingly, it was observed that the cost of
desolvation seemed to outweigh the beneficial interaction energy in most cases. The
second type of mutation represents a less intuitive type or set of mutations, as the energy
gained is primarily a result of eliminating an unfavorable desolvation while maintaining
non-polar interactions. The third mutation type concerns long-range interactions that
show potential for significant gain in affinity. These types of mutations are particularly
interesting because they do not make direct contacts with the antigen and should,
therefore, pose less of a perturbation in the delicate interactions at the antibody-antigen
interface.

In accordance with the computational data obtained as described above, mutants
of AQC2 (single chain Fv mutants) were generated, and their affinity was measured by
the KinExA™ assay described above. The mutants generated to date are shown in the
table that follows. Where an affinity assay has been conducted, the results are shown in
the column headed "Kd." The affinity of the original AQC2 single chain Fv was 25 nM.

Table 4 - Observed affinity values for AQC2 altered antibodies

Heavy Chain Modifications

Mutant     Kd
R31Q       8.2 nM
Y32F       34 nM
S35N       39 nM
S35Q       37 nM
T50V       14 nM
L60D       21 nM
K64E       38 nM
K64Q       12 nM
K64D       6.3 nM
K64N       4.1 nM
D106N      67 nM

Light Chain Modifications

Mutant     Kd
N30V       8.9 nM
H31R       31 nM
N52K       49 nM
N52R       17 nM
N52H       43 nM



The following alterations in AQC2 were also made: heavy chain modifications 
R31K, R31F, R31E, H56F, Y58E, Y58Q, Y59D, Y59E, L60E; light chain mutations 
N30L, N30A, N30I, H31K, L49K, L49R, L49H, W95E, W95D. 
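To compare the measured Kd values with the computed energies, a dissociation constant ratio can be converted to a binding free energy difference using the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild-type). The sketch below assumes a temperature of 298.15 K, which is not stated in this description, and uses illustrative function names.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd(kd_mutant_nm, kd_wildtype_nm, temperature_k=298.15):
    """Binding free energy change (kcal/mol); negative means tighter binding.

    The default temperature is an assumption made for illustration.
    """
    return R_KCAL * temperature_k * math.log(kd_mutant_nm / kd_wildtype_nm)


def fold_improvement(kd_mutant_nm, kd_wildtype_nm):
    """How many times tighter the mutant binds than the wild type."""
    return kd_wildtype_nm / kd_mutant_nm


KD_WILDTYPE_NM = 25.0  # original AQC2 single chain Fv, from the text
ddg_k64n = ddg_from_kd(4.1, KD_WILDTYPE_NM)    # K64N, Table 4
ddg_d106n = ddg_from_kd(67.0, KD_WILDTYPE_NM)  # D106N, Table 4
```

Under that temperature assumption, K64N (4.1 nM against 25 nM) corresponds to roughly a six-fold improvement and a ΔΔG of about -1.1 kcal/mol, while D106N (67 nM) has a positive ΔΔG.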
EXAMPLE 2

METHODS OF IMPROVING THE ANTIGEN-BINDING AFFINITY
OF AN ANTI-CD154 ANTIBODY

In this example, methods for improving the binding affinity of an antibody against
a therapeutically relevant antigen target are described.

An antibody against human CD154 (also known as CD40 ligand or CD40L; see,
e.g., Yamada et al., Transplantation, 73:S36-9 (2002); Schonbeck et al., Cell. Mol. Life
Sci. 58:4-43 (2001); Kirk et al., Philos. Trans. R. Soc. Lond. B. Sci. 356:691-702 (2001);
Fiumara et al., Br. J. Haematol. 113:265-74 (2001); and Biancone et al., Int. J. Mol. Med.
3(4):343-53 (1999)), which is a member of the TNF family of proteins involved in mediating
immunological responses, was raised by affinity maturation in mice. The 5c8 monoclonal
antibody was developed from such studies and determined to inhibit the pathological
processes mediated by CD154/CD40L.

In an effort to increase the affinity of the 5c8/CD40L interaction, electrostatic charge
optimization techniques were applied to a crystal structure of the antibody-antigen
complex in a two-level procedure to suggest improved-affinity mutants. First,
electrostatic charge optimization was used to determine the position(s) of the CDR
residue(s) that are sub-optimal for binding (Lee and Tidor, J. Chem. Phys. 106:8681-8690,
1997; Kangas and Tidor, J. Chem. Phys. 109:7522-7545, 1998). Second, a set of
CDR mutations was determined for further computational analysis. Based on these
calculations, the binding affinity was computationally determined for 23 modified
antibodies having a single mutation (i.e., 23 "single mutants"). It was predicted that 8 of
the single mutants would be more favorable than wild-type antibody both in terms of
electrostatic energy and in terms of a full energy function including a van der Waals
energy term and a solvent accessible surface area term. These terms are unrelated to
electrostatic forces, but they were calculated to ensure that the designed mutations did not
contact other residues and would not reduce the amount of buried surface area
significantly; increased buried surface area in complex formation is usually beneficial
(see the "Full Energy" column of the table below).

The mutation predictions can be categorized as involving (1) mutations at the 
interaction interface involving residues that become partially buried upon binding 
(interactions are improved by making hydrogen bonds with the antibody); (2) mutations 
of polar residues on the antibody that become buried upon binding and thus pay a 
desolvation penalty but do not make any direct electrostatic interactions with the antibody 
(improvements are usually made by mutation to a hydrophobic residue with similar shape 
to the wild-type residue or by adding a residue that can make favorable electrostatic 
interactions); and (3) mutations of surface residues on the antibody that are in regions of 
uncomplementary potentials. These mutations are believed to improve long-range 
electrostatic interactions between the antibody and antigen without perturbing packing
interactions at the binding interface. 

Based on results from a charge optimization, mutations were determined for
computational analysis (the optimal charge distributions and design mutations that were
closer to optimal than the current residue were examined; this process was done by
inspection). A charge optimization gave charges at atom centers but did not yield actual
mutation(s). A round of charge optimizations was performed with various constraints
imposed to represent natural side chain characteristics. For example, an optimization was
performed for a net side chain charge of -1, 0, and +1 with the additional constraint that
no atom's charge exceeded an absolute value of 0.85 electron charge units.

The crystal structure of the CD40L/5c8 complex (PDB code: 1I9R) was prepared
using standard procedures for adding hydrogens with the program CHARMM (Accelrys,
Inc., San Diego, CA). N-acetamide and N-methylamide patches were applied to the N
termini and C-termini, respectively. Using a continuum electrostatics model, an
electrostatic charge optimization was performed on each side chain of the amino acids in
the CDRs of the 5c8 antibody. Appropriate side chain mutations were then determined
based on the potential gain in electrostatic binding energy observed in the optimizations.
Side chains were built by performing a rotamer dihedral scan in CHARMM, using
dihedral angle increments of 60 degrees, to determine the most desirable position for each
side chain. Binding energies were then calculated for the wild type and mutant
complexes using the Poisson-Boltzmann electrostatic energy and additional terms for the
van der Waals energy and buried surface area.

The crystal structure of the CD40 ligand complexed with the Fab fragment of a
humanized neutralizing antibody (5c8) was solved to 3.1 Å at a pH of 6.50. Since CD40L

is naturally a trimer, there are three 5c8 Fab molecules and three CD40L molecules in the
complex. They form three independent CD40L/5c8 interfaces in the complex. A zinc
(ZN) atom was bound to each of the 5c8 Fab molecules and was included in the calculation.
Calculations were carried out independently for the three interfaces, and the amino acid
substitutions that were found to be favorable over wild type for all three sites were
exploited.

The following table shows the optimization results obtained for CDR variable
loop 1 in the light chain of 5c8 for all three 5c8 molecules. The Mut (Mutation energy)
column corresponds to the binding free energy difference (in kcal/mol) in going from the
native residue to a completely uncharged sidechain isostere, i.e., a residue with the same
shape but no charges or partial charges on the atoms. Negative numbers indicate a
predicted increase of binding affinity. The Opt-1 column corresponds to the binding free
energy difference that can be obtained with an optimal charge distribution in the side
chain and a net side chain charge of -1. The columns Opt0 and Opt1 correspond to the
binding free energy differences with optimal charges, the net charge being 0 and +1,
respectively. Based on these results and the visual inspection of the structure, mutations
are designed that could take advantage of these binding free energy improvements. For
instance, the mutation from SER31 to VAL, which is an uncharged isostere, makes use
of the predicted -1.23 to -0.98 kcal/mol in the mutation energy. The mutation GLN27 to
GLU uses the -1.21 to -0.88 kcal/mol predicted maximal free energy gain for a mutation
to a side chain with a net charge of -1.
Table 5 - Optimization results obtained for 5c8 CDR light chain variable loop 1

Chain   Residue     Mut    Opt-1     Opt0     Opt1
1L      24 ARG    -0.11     0.17    -0.11    -0.37
1L      26 SER    -0.06    -0.59    -0.06     0.57
1L      27 GLN     0.21    -1.21    -0.95    -0.26
1L      28 ARG     0.11    -0.96    -0.71    -0.40
1L      30 SER    -0.01    -0.14    -0.42    -0.47
1L      31 SER    -1.23     3.88    -2.16    -0.42
1L      32 SER     1.45     0.91    -0.65    -0.67
1L      33 THR    -0.02    -0.66    -0.41     0.07
1L      34 TYR    -0.25    -1.00    -1.10    -0.80
1L      35 SER    -0.02     0.00    -0.11     0.04
1L      36 TYR     0.01    -0.95    -1.31     1.74
1L      38 HSD    -0.15    -0.48    -0.70    -0.62
2L      24 ARG    -0.46    -1.04    -0.46     0.13
2L      26 SER    -0.29    -1.60    -0.79     0.19
2L      27 GLN     0.26    -0.88    -0.41     0.35
2L      28 ARG    -0.59    -0.94    -0.46     0.08
2L      30 SER     0.08    -0.38    -0.55    -0.42
2L      31 SER    -0.98     4.04    -1.89    -0.54
2L      32 SER     0.74     2.31    -0.86    -0.87
2L      33 THR     0.00    -0.65    -0.38     0.09
2L      34 TYR    -0.09    -0.62    -0.48    -0.12
2L      35 SER     0.09     0.02     0.09     0.18
2L      36 TYR     0.10    -1.70    -1.24     2.37
2L      38 HSD    -0.23    -1.20    -1.17    -0.79
3L      24 ARG    -0.35    -0.34    -0.35    -0.35
3L      26 SER    -0.27    -1.23    -0.53     0.27
3L      27 GLN     0.11    -1.07    -0.71    -0.08
3L      28 ARG    -0.30    -0.85    -0.30     0.15
3L      30 SER     0.03     0.02    -0.29    -0.36
3L      31 SER    -1.06     4.02    -2.03    -0.90
3L      32 SER     0.82     1.18    -0.85    -1.05
3L      33 THR     0.20    -0.32    -0.15     0.29
3L      34 TYR     0.09    -0.80    -0.74    -0.38
3L      35 SER     0.06    -0.05    -0.10    -0.02
3L      36 TYR     0.04    -0.99    -1.30     1.66
3L      38 HSD    -0.20    -0.46    -0.76    -0.72


As described before, the designed mutants were built in silico and the binding
energy was recalculated. Results from these computational mutation calculations are
shown below. Numbers represent change in binding affinity from wild-type to the mutant
(negative meaning mutant is more favorable). Energies for all three chains of 5c8 are
given.
Table 6 - Computational mutation calculations for 5c8 CDRs

Chain   Mutant      Full Energy   Electrostatics
1H      TYR33PHE       0.197         -2.741
1H      ASN59ASP      -0.995         -2.548
1H      ASN59LEU      -1.294         -2.517
1L      SER26ASP      -0.703         -0.712
1L      GLN27GLU      -0.514         -0.357
1L      SER31VAL       8.154         -1.739
1L      THR33ASP      -0.219         -0.916
1L      TYR54GLU      -0.999         -0.729
2H      TYR33PHE       0.623         -2.726
2H      ASN59ASP      -0.218         -2.885
2H      ASN59LEU      -1.116         -3.067
2L      SER26ASP      -1.333         -1.627
2L      GLN27GLU      -0.658         -0.395
2L      SER31VAL       9.293         -0.832
2L      THR33ASP      -0.430         -1.359
2L      TYR54GLU      -1.012         -1.030
3H      TYR33PHE       0.145         -1.979
3H      ASN59ASP      -0.837         -2.267
3H      ASN59LEU      -1.179         -2.271
3L      SER26ASP      -0.540         -0.565
3L      GLN27GLU      -0.497         -0.342
3L      SER31VAL       8.129         -1.284
3L      THR33ASP      -0.337         -0.676
3L      TYR54GLU      -1.123         -0.825
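The criterion of keeping only substitutions that are favorable at all three CD40L/5c8 interfaces can be expressed as a simple filter. The sketch below uses four of the Table 6 mutants, with each tuple holding (full energy, electrostatics) for chains 1, 2 and 3; the data layout and the function name `favorable_everywhere` are illustrative assumptions, and "favorable" is taken here to mean that both energies are negative.

```python
# (full energy, electrostatics) in the three interfaces, values from Table 6.
TABLE6_SUBSET = {
    "SER26ASP": [(-0.703, -0.712), (-1.333, -1.627), (-0.540, -0.565)],
    "GLN27GLU": [(-0.514, -0.357), (-0.658, -0.395), (-0.497, -0.342)],
    "SER31VAL": [(8.154, -1.739), (9.293, -0.832), (8.129, -1.284)],
    "TYR33PHE": [(0.197, -2.741), (0.623, -2.726), (0.145, -1.979)],
}


def favorable_everywhere(table):
    """Mutants whose full energy and electrostatics are negative at every interface."""
    return sorted(
        mutant
        for mutant, rows in table.items()
        if all(full < 0 and elec < 0 for full, elec in rows)
    )


selected = favorable_everywhere(TABLE6_SUBSET)
```

In this subset, SER31VAL and TYR33PHE are rejected because their full energies are positive even though their electrostatic terms are favorable, whereas SER26ASP and GLN27GLU pass at all three interfaces.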



As the results show, the computational process described above was successfully 
implemented to predict affinity enhancing side chain mutations. These findings have 
been classified into three general classes of mutations. The first type of mutation involves 
residues at the interface across from a charged group on the antigen capable of making a 
hydrogen bond; the second involves buried polar residues that pay a desolvation penalty 
upon binding but do not make back electrostatic interactions; and the third involves long- 
range electrostatic interactions. 

The first type of mutation was resolved by inspection, as these residues essentially 
make hydrogen bonds with unsatisfied hydrogen partners of the antigen. Surprisingly, the 
cost of desolvation seemed to outweigh the beneficial interaction energy in most cases. 
The second type of mutation represents a less intuitive type or set of mutations, as the 
energy gained is primarily a result of eliminating an unfavorable desolvation while 
maintaining non-polar interactions. The third mutation type concerns long-range 
interactions that show potential for significant gain in affinity. These types of mutations 
are particularly interesting because they do not make direct contacts with the antigen and, 
therefore, pose less of a perturbation in the delicate interactions at the antibody-antigen 
interface. 
In accordance with the computational data obtained as described above, mutants
of 5c8 (Fab fragments) were generated, and their affinity towards CD40L was measured
by the KinExA™ assay described above. Selected results of some of the mutants
generated to date are shown in the table that follows. Where an affinity assay has been
conducted, the results are shown in the column headed "IC50." The affinity of the
original 5c8 Fab to CD40L was 0.81 nM.

Table 7 - Observed affinity values for 5c8 altered antibodies

Light Chain Modifications

Mutant     IC50
S26D       0.26 nM
Q27E       0.12 nM

Accordingly, it was concluded that the methods of the invention allow for the
affinity maturation of an antibody of therapeutic relevance.
Equivalents

For one skilled in the art, using no more than routine experimentation, there are 
many equivalents to the specific embodiments of the invention described herein. Such 
equivalents are intended to be encompassed by the following claims. 
(57) ABSTRACT

The present computer-implemented process involves a methodology for determining
properties of ligands which in turn can be used for designing ligands for binding with
protein or other molecular targets, for example, HIV targets. The methodology defines
the electrostatic complement for a given target site and geometry. The electrostatic
complement may be used with steric complement for the target site to discover ligands
through explicit construction and through the design or bias of combinatorial libraries.
The definition of an electrostatic complement, i.e., the optimal tradeoff between
unfavorable desolvation energy and favorable interactions in the complex, has been
discovered to be useful in ligand design. This methodology essentially inverts the
design problem by defining the properties of the optimal ligand based on physical
principles. These properties provide a clear and precise standard to which trial ligands
may be compared and can be used as a template in the modification of existing ligands
and the de novo construction of new ligands. The electrostatic complement for a given
target site is defined by a charge distribution which minimizes the electrostatic
contribution to binding at the binding sites on the molecule in a given solvent. One way
to represent the charge distribution in a computer system is as a set of multipoles. By
identifying molecules having point charges that match this optimum charge distribution,
the determined charge distribution may be used to identify ligands, to design drugs, and
to design combinatorial libraries.
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COMPUTER SYSTEM AND PROCESS FOR 
IDENTIFYING A CHARGE DISTRIBUTION 

WHICH MINIMIZES ELECTROSTATIC 
CONTRIBUTION TO BINDING AT BINDING 
BETWEEN A LIGAND AND A MOLECULE IN 
A SOLVENT AND USES THEREOF 

RELATED APPLICATIONS 

This application claims the benefit under 35 USC §119(e) 
of U.S. Provisional Patent Application Serial No. 60/042, 
692 filed on Apr. 4, 1997, entitled COMPUTER SYSTEM 
AND PROCESS FOR IDENTIFYING A CHARGE DIS- 
TRIBUTION WHICH MINIMIZES ELECTROSTATIC 
CONTRIBUTION TO BINDING AT BINDING 
BETWEEN A LIGAND AND A MOLECULE IN A SOL- 
VENT AND USES THEREFOR. The contents of the pro- 
visional application are hereby expressly incorporated by 
reference. 

GOVERNMENT SUPPORT 

This work was funded in part by grant numbers GM 
47678 and GM 56552 from the National Institutes of Health. 
Accordingly, the United States Government may have cer- 
tain rights to this invention. 

FIELD OF THE INVENTION 

The present invention relates to rational drug design, and 
more particularly, to rational drug design based upon the 
prediction of a charge distribution on a ligand which mini- 
mizes the electrostatic contribution to binding between the 
ligand and its target molecule in a solvent. The present 
process also relates to methods and tools for making such 
predictions and enhanced-binding ligands, and to the diag- 
nostic and therapeutic uses of the ligands so produced. 

BACKGROUND OF THE INVENTION 



Methods for computational rational drug design include 
two general approaches: those that screen whole molecules 
and those that probe local sites and construct molecules 
through the joining of molecular fragments or grafting of 
chemical moieties onto a parent structure. DOCK is an 
example of a whole-molecule algorithm which uses a pro- 
cedure to find the complementary shape to a given target site 
(I. D. Kuntz, et al., J. Mol. Biol. 161:269 (1982) (Kuntz); R.
L. DesJarlais, et al., J. Med. Chem. 31:722 (1988) 
(DesJarlais)). Large compound databases can be computa- 
tionally "screened" by first eliminating molecules whose 
shape is incompatible with the target site (by computing an 
overlap with the complementary shape) and then by attempt- 
ing to rank those that remain with an approximate energy 
function. This procedure has been successful at identifying 
a number of ligands that bind to target sites. Unfortunately, 
X-ray crystal studies have shown that the ligands often bind 
differently in the site than predicted. One possible reason for 
this discrepancy between prediction and reality is that 
although the shape-complementarity algorithm is effective 
at removing extremely incompatible trial ligands, the 
approximate energy function is too inexact to define higher- 
level details of binding. 

The MCSS (Multiple Copy Simultaneous Search) algorithm
is one of the most popular fragment-based approaches
to ligand design (P. J. Goodford, J. Med. Chem. 28:849
(1985); A. Miranker and M. Karplus, Proteins: Struct.,
Funct., Genet. 11:29 (1991); and A. Caflisch, et al., J. Med.
Chem. 36:2142 (1993)). The essential idea is to search the
region of a binding site and determine locations having
especially favorable interaction energy with probes that
represent a library of functional groups (carbonyl, amide,
amine, carboxylate, hydroxyl, etc.). After the probes are
successfully placed in the binding site, various subsets are
linked to form coherent molecules. Two approaches to this
problem have been developed. One attempts to fit small
molecules from a database to join functional groups
(HOOK) (M. B. Eisen, et al., Proteins: Struct., Funct.,
Genet. 19:199 (1995)), and the other uses a simulated
annealing protocol to grow linker atoms and bonds between
fragments to produce ligand candidates with good covalent
geometry and non-bonded interactions (DLD, dynamic
ligand design) (A. Miranker and M. Karplus, Proteins:
Struct., Funct., Genet. 11:29-34 (1991) and 23:472 (1995)).

The current methods for rational drug design are useful
for suggesting novel and provocative geometries that appear
to roughly compensate hydrogen-bonding groups in the site.
Unfortunately, the current methods use approximations
which may be inaccurate and which result in difficulties in
accurately ranking candidates. Thus, although a number of
computational algorithms exist both for the analysis of
binding sites and bound complexes and for the rational
design of ligands and other drug candidates, structure-based
design remains an imprecise and non-deterministic
endeavor.

SUMMARY OF THE INVENTION 

The limitations of the prior art are overcome by providing
for (i) a rigorous treatment of solvation, dielectric, and 
long-range electrostatic effects operating in both the 
unbound and the bound state of the target molecule and the 
ligand candidate, and (ii) a detailed quantitative method for 
ranking suggested ligands. The present process is based 
upon the discovery that the crude treatment of solvent, 
long-range electrostatics, and dielectric effects, as well as 
the lack of appropriate treatment for the unbound state of the 
target molecule and the ligand candidate, have limited the 
rational design and identification of novel ligand candidates
for binding to a preselected target molecule. The present 
computer-implementation overcomes these limitations by 
providing a process which considers the exchange nature of 
ligand/target molecule binding, in which interactions with 
solvent are traded for interactions between a ligand and its
complementary target molecule. In contrast to the prior art 
methods, the process disclosed herein takes into account 
solvent, long-range electrostatics, and dielectric effects in 
the binding between a ligand and its target receptor in a 
solvent. 

Accordingly, in one aspect, a process for identifying 
properties of a ligand for binding to a target molecule (e.g., 
receptor, enzyme) in a solvent given a representation of a 
shape of the target molecule in three dimensions is provided. 
The process involves selecting a shape of the ligand defined
in three dimensions, which shape is complementary to 
(matches) a shape of a selected portion of the target mol- 
ecule; and determining a representation of a charge distri- 
bution on the ligand which minimizes the electrostatic 
contribution to binding between the ligand and the target
molecule in the solvent. In some embodiments, the representation
of the charge distribution is a set of multipoles. In
other embodiments, the process further involves the step of 
identifying a molecule having point charges that match the 
representation of the charge distribution.

These methods are particularly useful for designing 
enhanced-binding ligands for binding to a target molecule
which has a known ligand. As used herein, an enhanced-
binding ligand refers to a ligand which has a structure that 
is based upon that of a known ligand for the target molecule 
but which is modified in accordance with the methods 
disclosed herein to have a charge distribution which minimizes
the electrostatic contribution to binding between the
ligand and the target molecule in a solvent. Thus, the present 
computer-implemented process provides a method of ratio- 
nal drug design that identifies such improved ligands for 
binding to a target molecule having a known or predictable 
three-dimensional structure. The method involves selecting 
a shape of the ligand defined in three dimensions which 
matches a shape of a selected portion of the target molecule 
and determining a representation of a charge distribution on 
the ligand which minimizes electrostatic contribution to 
binding between the ligand and the target molecule in the
solvent. 

The target molecules for which ligands are identified 
using the claimed process are molecules for which a repre- 
sentation of the three dimensional shape of the molecule is 
known or can be predicted. Such target molecules include
biopolymers and non-biopolymers. Exemplary biopolymers 
include proteins, nucleic acids, lipids, carbohydrates, and 
mixtures of the foregoing (e.g., glycoproteins, lipoproteins 
and so forth). Exemplary non-biopolymers include 
polyamides, polycarbonates, polyalkylenes, polyalkylene
glycols, polyalkylene oxides, polyalkylene terephthalates,
polyvinyl alcohols, polyvinyl ethers, polyvinyl esters,
polyvinyl halides, polyvinylpyrrolidone, polyglycolides,
polysiloxanes, polyurethanes, alkyl cellulose, polymers of
acrylic and methacrylic esters, polyethylene, polypropylene,
poly(ethylene glycol), poly(ethylene oxide), poly(ethylene
terephthalate), poly(vinyl alcohols), polyvinyl acetate,
polyvinyl chloride, polystyrene, polyvinylpyrrolidone, polymers
of lactic acid and glycolic acid, polyanhydrides, poly(ortho)
esters, polyurethanes, poly(butyric acid), poly(valeric acid),
poly(lactide-co-caprolactone) and copolymers thereof.

As used herein, the terms "protein" or "polypeptide" are 
used interchangeably to embrace a variety of biopolymers 
that are formed of amino acids, e.g., receptors, hormones, 
and enzymes. It should be understood that as described
herein, references to a "protein", a "polypeptide", or a 
"receptor" are generally applicable to analogous structures, 
such as lipoproteins, glycoproteins, proteins which have 
other organic or inorganic groups attached, and multi-chain 
and multi-domain polypeptide structures such as large
enzymes and viruses, and include non-biopolymers. In these 
instances, analogous issues regarding the electrostatic con- 
tribution to binding between the ligand and the protein 
molecule are present. 

In some embodiments, the target molecule is a protein and
the present computer-implemented process is used to iden- 
tify novel and/or improved ligands for binding to a protein 
having a known three-dimensional structure in a solvent. 
Known binding partners of ligands and proteins include 
hormone/receptor, cofactor or inhibitor/enzyme, antigen/
antibody, and so forth. For proteins to which a ligand 
previously has been identified, the present process is used to 
identify the appropriate modifications to the known ligand 
structure to achieve a charge distribution on the "improved" 
ligand that minimizes the electrostatic contribution to binding
between the improved ligand and the protein compared
to that of the unmodified (natural) ligand. Exemplary ligand/ 
protein binding partners used as starting points for identi- 
fying "improved" ligands in accordance with the present 
process are provided in the examples.

In another aspect, a process for identifying novel and/or
enhanced-binding ligands that bind to a target molecule that
is a protein is provided. Proteins are known to fold into a
three-dimensional structure which is dictated by the 
sequence of the amino acids (the primary structure of the 
protein) and by the solvent in which the protein is provided. 
The biological activity and stability of proteins are depen- 
dent upon the protein's three-dimensional structure. The 
three-dimensional structure of a protein can be determined 
or predicted in a number of ways. The best known way of 
determining a protein structure involves the use of X-ray 
crystallography. The three-dimensional structure of a protein 
also can be estimated using circular dichroism, light 
scattering, or by measuring the absorption and emission of 
radiant energy. Protein structure also may be determined 
through the use of techniques such as neutron diffraction, or 
by nuclear magnetic resonance (NMR). The foregoing meth- 
ods are known to those of ordinary skill in the art and are 
described in standard chemistry textbooks (e.g., Physical 
Chemistry, 4th Ed., Moore, W. J., Prentice-Hall, N.J. (1972)
and Physical Biochemistry, Van Holde, K. E., Prentice-Hall,
N.J. (1971)). Using the foregoing techniques, a number of
recurring patterns in naturally occurring proteins have been 
identified, the most common of which are alpha helices, 
parallel beta sheets and anti-parallel beta sheets. See, e.g., R. 
Dickerson, et al., The Structure and Action of Proteins 
(1969). Together, the helices, sheets and turns of a protein's 
secondary structure produce the three dimensional structure 
of the active molecule. The three dimensional structure of 
proteins can be determined empirically using physical bio- 
chemical analysis or, alternatively, can be predicted using 
model building of three dimensional structures of one or 
more homologous proteins which have a known three 
dimensional structure. 

The present computer-implemented process is particu- 
larly useful for designing an improved ligand that has a 
structure which is based upon the structure of a known 
ligand for a target molecule but which has been modified in 
accordance with the present methods to have a charge 
distribution which minimizes the electrostatic contribution 
to binding between the improved ligand and the target 
molecule in a solvent. Such improved ligands are referred to 
herein as "enhanced-binding ligands". Accordingly, the 
present process uses a ligand of known conformation as a 
starting point for the further optimization and selection of a 
ligand structure which will have reduced electrostatic con- 
tribution to binding to the molecule and the solvent. For 
example, the present process is used to produce an improved 
(enhanced-binding) co-factor or inhibitor of an enzyme 
(e.g., HIV-1 protease). 

The present computer-implemented process also provides 
for the design of an improved hormone or other ligand for 
optimum binding (minimized electrostatic contribution to 
binding) to fit any known receptor site. This process is 
particularly useful for drug design, since it permits drugs to 
be designed and manufactured which more selectively and 
more stably are capable of binding to the receptor site. The 
design of improved ligands for binding to receptors means 
that lower dosages can be used, thereby reducing the chance 
of side effects and/or toxicity that may be associated with 
higher dosages. The design of improved ligands for binding 
to receptors also permits the identification of drugs having 
greater efficacy than the original ligand which is used as the 
basis for identifying an improved ligand having improved 
binding properties. Accordingly, known ligands for a protein 
can be used as a starting point for the design of improved 
ligands, wherein the improvement is based upon the 
improved binding properties of the ligand to the protein that 
are attributed to the selection of a ligand having a charge
distribution which minimizes the electrostatic contribution
to binding between the ligand and the protein in a solvent. 
Thus, the present process permits the customizing of antigens
and epitopes to bind to antibodies more selectively and with
greater affinity, and also provides for the design
and selection of novel and/or improved ligands which bind 
to other receptors or target molecules. 

The ligands that are identified in accordance with the 
methods disclosed herein can be labeled with detectable 
labels such as radioactive labels, enzymes, chromophores 
and so forth for carrying out immuno-diagnostic procedures 
or other diagnostic procedures. These labeled agents can be 
used to detect the target molecules in a variety of diagnostic 
samples. For imaging procedures, in vitro or in vivo, the 
ligands identified herein can be labeled with additional 
agents, such as NMR contrasting agents or x-ray contrasting 
agents. Methods for attaching a detectable agent to a 
polypeptide or other small molecule containing reactive 
amino groups are known in the art. The ligands also can be
attached to an insoluble support for facilitating diagnostic
assays. 

The present computer-implemented process also is useful 
for searching three-dimensional databases for structures 
which have a shape which matches a shape of a selected 
portion of the protein and which also has a charge distribu- 
tion which minimizes electrostatic contribution to binding 
between the ligand and the protein in a solvent. 
Alternatively, three-dimensional databases can be selected 
on the basis of the shape of the ligand alone (so that it 
matches a shape of a selected portion of the protein) with 
further modification of the database molecules that satisfy 
this criterion to have a charge distribution which minimizes
electrostatic contribution to binding between the modified 
ligand and the protein in a solvent. Search algorithms for 
three-dimensional database comparison are available in the 
literature. See, e.g., U.S. Pat. No. 5,612,895, issued to V.
Balaji, et al., "Method of Rational Drug Design Based on Ab
Initio Computer Simulation of Conformational Features of 
Peptides" and references disclosed therein. For related com- 
puter methods for drug design, see also, U.S. Pat. No. 
5,081,584, issued to Omichinski et al., "Computer-assisted
Design of Anti-peptides Based on the Amino acid Sequence 
of a Target Peptide", and U.S. Pat. No. 4,939,666, issued to 
Hardman, "Incremental Macromolecule Construction Methods".

Each of the novel and/or "improved" ligands identified
using the present process is prepared employing standard
synthetic or recombinant procedures and then tested for 
bioactivity. Those compounds which display bioactivity are 
candidate peptidomimetics; those compounds which do not 
display bioactivity help further define portions of the ligand 
which are essential for binding of the ligand to the target 
molecule. As used herein, a peptidomimetic broadly refers
to a compound which mimics a peptide. For example, 
morphine is a peptidomimetic of the peptide endorphin. 

A database of known compounds (e.g., the Cambridge 
Crystal Structure Data Base, Crystallographic Data Center, 
Lensfield Road, Cambridge, CB2 1EW, England; and Allen, 
F. H., et al., Acta Crystallogr., B35:2331 (1979)) also can be
searched for structures which contain the steric (shape) 
parameters used for complementing (matching) a shape of a 
selected portion of the target molecule. Compounds which 
are found to contain the desired steric parameters are 
retrieved, and further analyzed to determine which of the 
retrieved compounds also have the desired charge distribu- 
tion or that can be modified to have the desired charge 
distribution to minimize the electrostatic contribution to
binding between the ligand and the target molecule in a
solvent. Ligands that are found to contain both the desired
shape and charge distribution are additional candidates as
peptidomimetics of the original target peptides.

The ligands which are identified in accordance with the
present computer-implemented process are evaluated for 
biological activity and/or for binding affinity to the target 
molecule. An iterative approach is used to identify the 
ligands having the most preferred biological properties. See, 
e.g., PCT WO 19359, "Process for making Xanthene or
Cubane based compounds, and Protease Inhibitors", which 
describes an iterative process for identifying the bioactive 
conformation of an enzyme inhibitor in a complex chemical 
combinatorial library. The bioactive conformation then is 
used to design peptidomimetics, or used to search a
three-dimensional database of organic structures to suggest
potential peptidomimetics. Standard physiological,
pharmacological and biochemical procedures are available for testing
the "improved" or novel ligands identified using the present
process. The particular protocol for evaluating bioactivity is
a function of the compound that is being tested. This kind of
analysis can be applied to known ligands that bind to a target
molecule (e.g., HIV protease, MHC class II proteins) to
design enhanced-binding ligands for these biologically
important target molecules.



BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings,

FIG. 1 is a block diagram describing one embodiment of
the present computer-implemented process;

FIG. 2 is a block diagram of a computer system which
may be used to implement the present computer-implemented
process;

FIG. 3 is a diagram illustrating chemical principles
underlying the present computer-implemented process;

FIG. 4 is a diagram illustrating inhibitors of HIV-1
protease; and

FIG. 5 is an illustration of problem geometries.

DETAILED DESCRIPTION

The present computer-implemented process will be more
completely understood through the following detailed
description, which should be read in conjunction with the
attached drawings, in which similar reference numbers
indicate similar structures. All references cited above and in the
following description are hereby expressly incorporated by
reference.

The present computer-implemented process involves
a methodology for determining properties of ligands which
in turn is used for designing ligands for binding with protein
or other molecular targets, for example, HIV targets. The
methodology defines the electrostatic complement for a
given target site and geometry. The electrostatic complement
may be used with a steric complement for the target site to
discover ligands through explicit construction and through
the design or bias of combinatorial libraries.

The definition of an electrostatic complement, i.e., the
optimal tradeoff between unfavorable desolvation energy
and favorable interactions in the complex, has been
discovered to be useful in ligand design. This methodology
essentially inverts the design problem by defining the properties
of the optimal ligand based on physical principles. These
properties provide a clear and precise standard to which trial
ligands may be compared and can be used as a template in the
modification of existing ligands and the de novo
construction of new ligands.

The electrostatic complement for a given target site is
defined by a charge distribution which minimizes the elec- 
trostatic contribution to binding at the binding sites on the 
molecule in a given solvent. One way to represent the charge 
distribution in a computer system is as a set of multipoles. 
By identifying molecules having point charges that match 
this optimum charge distribution, the determined charge 
distribution may be used to identify ligands, to design drugs, 
and to design combinatorial libraries. 
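
By way of illustration only, the following minimal Python sketch shows one way a set of point charges might be reduced to low-order multipoles. It assumes Cartesian moments (monopole, dipole, and traceless quadrupole) rather than the spherical-harmonic multipoles used in the analysis below, and the function name low_order_multipoles and the example charges are hypothetical.

```python
def low_order_multipoles(charges, positions, origin=(0.0, 0.0, 0.0)):
    """Return (monopole, dipole, quadrupole) for a set of point charges.

    Assumption: Cartesian moments about `origin`; the quadrupole is the
    traceless tensor Q_ij = sum_k q_k (3 r_i r_j - |r|^2 delta_ij).
    """
    monopole = float(sum(charges))
    dipole = [0.0, 0.0, 0.0]
    quadrupole = [[0.0] * 3 for _ in range(3)]
    for q, pos in zip(charges, positions):
        # Position of this charge relative to the expansion origin.
        r = [pos[i] - origin[i] for i in range(3)]
        r2 = sum(c * c for c in r)
        for i in range(3):
            dipole[i] += q * r[i]
            for j in range(3):
                delta = r2 if i == j else 0.0
                quadrupole[i][j] += q * (3.0 * r[i] * r[j] - delta)
    return monopole, dipole, quadrupole


# Hypothetical example: a unit dipole aligned with the z axis.
example = low_order_multipoles([1.0, -1.0], [(0, 0, 0.5), (0, 0, -0.5)])
```

A candidate molecule's point charges could be summarized in this way before being compared with the determined optimum.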

Referring now to FIG. 1, one embodiment of the present 
computer-implemented process is shown. This embodiment 
may be implemented using one or more computer programs 
on a computer system, an example of which is described 
below. Given a definition of a molecule for which a ligand 
is to be designed, indicated at 30, a molecular analysis tool 
32 provides a possible conformation, or shape, of the 
molecule as indicated at 34. There are several systems 
available to provide such conformations, including but not 
limited to x-ray crystallography, homology modeling, 
nuclear magnetic resonance imaging or analytical tech- 
niques such as shown in Kuntz and DesJarlais. The desired 
binding or active points on the molecule, indicated at 36, and 
a desired ligand shape for binding with the molecule at the 
indicated binding points, as indicated at 38, also are input to 
the computer system. 

An electrostatic continuum analyzer 40, described in 
more detail below, is used to determine a charge distribution 
which minimizes the electrostatic contribution to binding at 
the binding sites in a given solvent, given the representation 
of the shape of the molecule in three dimensions, the binding 
sites on the molecule defined by locations in three dimen- 
sions and the desired ligand shape, also defined in three 
dimensions. Accordingly, the output of analyzer 40 is a 
representation of a charge distribution minimizing electro- 
static contribution to binding as indicated at 42. 

The charge distribution 42 is used in combination with 
candidate ligands having the desired ligand shape, as indi- 
cated at 44. A candidate ligand shape and point charge 
analyzer 46 determines which candidate ligands have a 
charge distribution closest to the optimal charge distribution 
42. Analyzer 46 outputs candidate ligands for the binding 
site as indicated at 48. Similarly, a screening system 50 may 
also be used to screen candidate ligands 44 for their prox- 
imity to the optimum charge distribution indicated at 42 in 
order to develop a combinatorial library 52. Such a combi- 
natorial library may be used to develop more complex 
molecules having desired characteristics. 
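
As a non-limiting sketch of the comparison performed by analyzer 46 and screening system 50, the Python fragment below ranks candidate ligands by how closely a descriptor of each candidate's charge distribution matches the optimum. The squared Euclidean distance, the descriptor vectors, the cutoff, and all function names are assumptions made for illustration; the description above does not prescribe a particular measure of closeness.

```python
def deviation(descriptor, optimal):
    """Squared Euclidean distance between a candidate and the optimum.

    Assumption: both are equal-length lists of real multipole components.
    """
    return sum((a - b) ** 2 for a, b in zip(descriptor, optimal))


def rank_candidates(candidates, optimal):
    """Return candidate names ordered from closest to farthest."""
    return sorted(candidates, key=lambda name: deviation(candidates[name], optimal))


def screen_library(candidates, optimal, cutoff):
    """Keep only candidates whose deviation is within `cutoff`."""
    return [
        name
        for name in rank_candidates(candidates, optimal)
        if deviation(candidates[name], optimal) <= cutoff
    ]


# Hypothetical descriptors (e.g., monopole and two dipole components).
optimal_descriptor = [0.0, 1.0, -0.5]
candidate_descriptors = {
    "ligand_A": [0.0, 0.9, -0.4],
    "ligand_B": [1.0, 0.0, 0.0],
    "ligand_C": [0.0, 1.0, -0.5],
}
ranking = rank_candidates(candidate_descriptors, optimal_descriptor)
library = screen_library(candidate_descriptors, optimal_descriptor, cutoff=0.1)
```

In this sketch the ranking plays the role of output 48 and the filtered list plays the role of combinatorial library 52.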

Referring now to FIG. 2, a suitable computer system 60 
typically includes an output device 62 which displays
information to a user. The computer system includes a main unit
61 connected to the output device 62 and an input device 64, 
such as a keyboard. The main unit 61 generally includes a 
processor 66 connected to a memory system 68 via an 
interconnection mechanism 70. The input device 64 is also 
connected to the processor 66 and memory system 68 via the 
interconnection mechanism 70, as is the output device 62.

It should be understood that one or more output devices 
may be connected to the computer system. Example output 
devices include a cathode ray tube (CRT) display, liquid 
crystal displays (LCD), printers, communication devices 
such as a modem, and audio output. It should also be 
understood that one or more input devices may be connected 
to the computer system. Example input devices include a 
keyboard, keypad, track ball, mouse, pen and tablet, com- 
munication device, audio input and scanner. It should be 
understood that the present computer-implemented process is not
limited to the particular input or output devices used in
combination with the computer system or to those described 
herein. 

The computer system 60 may be a general purpose
computer system which is programmable using a high level
computer programming language, such as "C", "Fortran," or
"Pascal." The computer system may also be specially
programmed, special purpose hardware. In a general purpose
computer system, the processor is typically a commercially
available processor, of which the series x86
processors, available from Intel, and the 680X0 series
microprocessors available from Motorola are examples.
Many other processors are available. Such a microprocessor
executes a program called an operating system, of which
UNIX, DOS and VMS are examples, which controls the
execution of other computer programs and provides
scheduling, debugging, input/output control, accounting,
compilation, storage assignment, data management and
memory management, and communication control and
related services. One embodiment was implemented using a
Hewlett-Packard 9000/735 computer with a PA-7200 (99
MHz) chip. The processor and operating system define a
computer platform for which application programs in
high-level programming languages are written.

A memory system typically includes a computer readable
and writeable nonvolatile recording medium, of which a
magnetic disk, a flash memory and tape are examples. The
disk may be removable, known as a floppy disk, or
permanent, known as a hard drive. A disk has a number of
tracks in which signals are stored, typically in binary form,
i.e., a form interpreted as a sequence of ones and zeros. Such
signals may define an application program to be executed by
the microprocessor, or information stored on the disk to be
processed by the application program. Typically, in
operation, the processor causes data to be read from the
nonvolatile recording medium into an integrated circuit
memory element, which is typically a volatile, random
access memory such as a dynamic random access memory
(DRAM) or static memory (SRAM). The integrated circuit
memory element allows for faster access to the information
by the processor than does the disk. The processor generally
manipulates the data within the integrated circuit memory
and then copies the data to the disk when processing is
completed. A variety of mechanisms are known for managing
data movement between the disk and the integrated
circuit memory element, and the present process is not
limited thereto. It should also be understood that the present
process is also not limited to a particular memory system.

It should be understood that the present computer-implemented
process is not limited to a particular computer
platform, particular processor, or particular high-level
programming language. Additionally, the computer system 60
may be a multiprocessor computer system or may include
multiple computers connected over a computer network.

Defining Ligand Properties

The process for defining complementary ligand properties
of electrostatic interactions using such continuum
calculations is outlined in FIG. 3. Because of the exchange nature
of electrostatic interactions, seemingly "strong" electrostatic
attractions found in the bound state frequently destabilize
the binding equilibrium, but presumably contribute to
specificity. That is, because of the substantial desolvation penalty
incurred for burying polar and charged groups, their net
electrostatic contribution to macromolecular association is
generally unfavorable. In designing a ligand for a given,
fixed target that has polar and charged groups at the site, it
is important to balance the desolvation and interaction
energies so as to contribute most favorably to binding or at
least to provide the smallest possible destabilization. The 
following method solves this problem analytically, using 
idealized geometries and a continuum electrostatic model, 
and provides a single, unique optimum which is solved 
exactly. 

For the case of binding a spherical ligand to an arbitrarily 
shaped receptor to form a spherical complex, the free energy 
of binding is expressed in terms of the charge multipoles of 
the ligand. By minimizing the binding free energy with 
respect to the multipoles, (i) there is a single, optimal 
multipole distribution defining the tightest binding ligand 
for the given geometry, (ii) this multipole distribution
corresponds to a minimum in $\Delta G_{var}$, (iii) at this optimum the
magnitude of the ligand desolvation penalty is exactly half
the magnitude of the favorable intermolecular electrostatic
interactions in the complex, and (iv) the loss in binding free
energy for a sub-optimal charge distribution is easily
calculated by comparing to the optimum. This minimization of
the binding free energy with respect to the multipoles
provides a clear and unambiguous route from the continuum
model, an accepted energetic description of macromolecules
and ligands, to a set of descriptors, i.e., the multipoles, for the
optimal ligand. For this method to be broadly applicable,
any requirement for spherical geometries is removed. 
Accordingly, macromolecules and ligands are of arbitrary 
shape and are treated as such. 
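
The half-magnitude relation of item (iii) and the loss of item (iv) can be checked numerically on a schematic model. The Python sketch below assumes a real-valued quadratic form in which a vector a stands in for the interaction coefficients and a symmetric positive-definite matrix b stands in for the net ligand desolvation penalty (bound-state reaction field less unbound-state solvation); the numerical values and function names are hypothetical and are not taken from any actual geometry.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def interaction(a, q):
    """Linear (ligand-receptor interaction) term a . q."""
    return sum(ai * qi for ai, qi in zip(a, q))


def desolvation(b, q):
    """Quadratic (ligand desolvation) term q^T b q."""
    n = len(q)
    return sum(q[i] * b[i][j] * q[j] for i in range(n) for j in range(n))


def binding_energy(a, b, q):
    """Schematic variational binding energy a . q + q^T b q."""
    return interaction(a, q) + desolvation(b, q)


def optimal_multipoles(a, b):
    """Stationary point of the quadratic form: a + 2 b q = 0."""
    return [-0.5 * x for x in solve(b, a)]


# Hypothetical coefficients for a two-component model.
a = [2.0, -1.0]
b = [[2.0, 0.5], [0.5, 1.0]]
q_opt = optimal_multipoles(a, b)

# (iii): desolvation penalty is half the magnitude of the interaction term.
half_rule_gap = desolvation(b, q_opt) - 0.5 * abs(interaction(a, q_opt))

# (iv): loss of a trial distribution relative to the optimum.
q_trial = [0.0, 0.0]
loss = binding_energy(a, b, q_trial) - binding_energy(a, b, q_opt)
```

Under these assumptions the loss equals the quadratic term evaluated on the difference between the trial and optimal distributions, so it is never negative.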

In the spherical case, a variational binding energy for
optimization is defined as follows:

$$\Delta G_{var} = G_{int,L-R} + G^{bound}_{hyd,L} - G^{unbound}_{hyd,L} \qquad (1)$$

This includes three terms, which are discussed separately
here. The first is the ligand-receptor screened interaction
energy, which includes a contribution from the interaction of
each multipole component of the ligand charge distribution
with all point charges in the receptor. These contributions are
accounted for by coefficients, the $\alpha_{l,m}$, which are computed
analytically, and the ligand multipoles ($Q_{l,m}$).
The second term is the bound-state reaction-field energy due
to the ligand charge distribution, $\Delta G^{bound}_{hyd,L}$. It has
contributions from all pairs of multipole components with the
same value of m, since the ligand multipole distribution is
generally expanded about a point that is not the center of the
spherical boundary in the bound state but the geometry is
chosen with azimuthal symmetry.
The third term is the unbound-state solvation energy, which
involves a contribution from each multipole component.
Because the multipole expansion is taken about the center of
the ligand sphere, and due to the orthogonality of the
spherical harmonics, all cross-terms cancel.
Combining the three terms and transforming to matrix
notation, one completes the square and solves for the
$Q^{opt}_{l,m}$, giving the optimal variational
binding energy. Since terms neglected from the variational
binding energy are constant for a given geometry, these
describe the multipoles of the optimal binding ligand. A
more detailed exposition of this process is set forth in the
article by L. P. Lee and B. Tidor, J. Chem. Phys.,
106:8681-8690 (1997), which is expressly incorporated
herein by reference and part of which can also be found in
the Appendix.
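
The completing-the-square step can be summarized in matrix form. The following is a sketch under stated assumptions, not a transcription of the Appendix: the multipoles are collected in a real vector $Q$, the interaction coefficients in a vector $\alpha$, and the bound- and unbound-state reaction-field coefficients in symmetric matrices $\beta$ and $\gamma$, with $B = \beta - \gamma$ assumed positive definite. The vector and matrix symbols and the real-valued treatment are introduced here for illustration only.

```latex
\begin{align}
\Delta G_{var}(Q) &= \alpha^{T} Q + Q^{T} B\, Q, \qquad B = \beta - \gamma \\
\nabla_{Q}\, \Delta G_{var} &= \alpha + 2 B Q = 0
  \;\Longrightarrow\; Q^{opt} = -\tfrac{1}{2} B^{-1} \alpha \\
\alpha^{T} Q^{opt} &= -\tfrac{1}{2}\, \alpha^{T} B^{-1} \alpha, \qquad
(Q^{opt})^{T} B\, Q^{opt} = +\tfrac{1}{4}\, \alpha^{T} B^{-1} \alpha \\
\Delta G_{var}^{opt} &= -\tfrac{1}{4}\, \alpha^{T} B^{-1} \alpha \\
\Delta G_{var}(Q) - \Delta G_{var}^{opt} &= (Q - Q^{opt})^{T} B\, (Q - Q^{opt})
\end{align}
```

Under these assumptions the third line shows the desolvation term at the optimum to be half the magnitude of the interaction term, and the last line gives the loss in binding free energy for any sub-optimal distribution by comparison to the optimum.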

An implementation of a computer program to perform the
kinds of processing outlined in Lee and Tidor, supra, and in
the Appendix can receive as inputs the value $l_{max}$, which
determines the size of the matrix of equations 59 and 61 (see
Appendix); the value $l_{cut}$, which truncates the innermost
summation in equations 25 and 46 (see Appendix); the geometry
of the problem, which indicates the shape of the target and the
ligand; and whether the monopole of the optimum is to be
free or fixed at some value. The geometry of the problem
includes the radius and the coordinates of the center of both the
bound-state and ligand spheres on the z axis and the
coordinates and magnitude of each partial atomic charge in
the system. The dielectric constants $\epsilon_1$ and $\epsilon_2$ are
determined by the particular problem. Evaluation of the
$\alpha$, $\beta$, and $\gamma$ values is carried out, followed by solution of matrix
equation 59 or 61 (see Appendix), for example by using LU
decomposition. The eigenvalues of the B matrix may be
obtained to verify that the stationary point is a minimum.
All real floating point values may be represented, for
example, using 64 bits or another suitable format. The
computation of the matrix algebra may be accomplished using
available or increased precision versions of appropriate
subroutines, such as defined in Press et al., Numerical
Recipes in C: The Art of Scientific Computing, Cambridge
University Press, Cambridge, 1988. The output of the
program when executed is a representation of the optimal
charge distribution (e.g., using multipoles), the nature of the
stationary point, and a file recording the alpha, beta, and
gamma values. Because a direct method, i.e., LU
decomposition, is used to solve a matrix equation, the time
scales as $(l_{max})^6$ and the memory used scales as $(l_{max})^4$. The
program may be improved by taking advantage of the
particularly sparse matrix in the matrix equation.
Optimizations also may be provided by solving the matrix equation
with iterative methods such as the conjugate gradient
method or various relaxation methods. This method has been
implemented and tested using both a highly symmetric
charge distribution and the terminus of an alpha-helix as the
receptor.
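
The solution step described above can be sketched in a few lines. This is a minimal illustration, assuming the variational energy has the real quadratic form alpha.Q + Q.B.Q with a symmetric B matrix; the function names, the three-component example values, and the use of a Cholesky test in place of an explicit eigenvalue computation are all assumptions made for the sketch, not details taken from the program described in the text.

```python
import math


def lu_decompose(a):
    """Doolittle LU factorization with partial pivoting. Returns (lu, perm)."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if abs(lu[pivot][k]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        if pivot != k:
            lu[k], lu[pivot] = lu[pivot], lu[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]          # multiplier stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, rhs):
    """Solve A x = rhs given the factorization returned by lu_decompose."""
    n = len(lu)
    y = [0.0] * n
    for i in range(n):                    # forward substitution (unit-diagonal L)
        y[i] = rhs[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):          # back substitution (U)
        x[i] = (y[i] - sum(lu[i][j] * x[j] for j in range(i + 1, n))) / lu[i][i]
    return x


def is_positive_definite(b):
    """Cholesky test: succeeds only if symmetric b has all eigenvalues > 0."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = b[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


def quadratic_form(q, b):
    n = len(q)
    return sum(q[i] * b[i][j] * q[j] for i in range(n) for j in range(n))


def optimal_multipoles(alpha, b):
    """Stationary point of alpha.Q + Q.B.Q, i.e. the solution of 2 B Q = -alpha."""
    lu, perm = lu_decompose(b)
    return lu_solve(lu, perm, [-0.5 * a for a in alpha])


# Hypothetical three-component example (values are illustrative only).
alpha = [-4.0, 1.0, 0.5]
b_matrix = [[2.0, 0.3, 0.0],
            [0.3, 1.5, 0.2],
            [0.0, 0.2, 1.0]]
q_opt = optimal_multipoles(alpha, b_matrix)
interaction = sum(a * q for a, q in zip(alpha, q_opt))
desolvation = quadratic_form(q_opt, b_matrix)
print("minimum:", is_positive_definite(b_matrix))
print("interaction:", interaction, "desolvation:", desolvation)
```

In this toy model the printed desolvation term is half the magnitude of the interaction term, which is the relationship stated for the optimum in the spherical case. A production code would more likely call a tested linear-algebra library than hand-written routines.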

This method is extended to arbitrarily shaped molecules
by using iterative numerical computation to calculate the
corresponding matrix coefficients and, for efficiency, by
using a number of centers dispersed through the ligand
volume at which individual multipole expansions are
located.

When this method is extended to non-spherical
geometries, it takes the following form. The $\alpha_{l,m}$ retain the
same character, the $\beta_{l,l',m}$ become $\beta_{l,m,l',m'}$ because
azimuthal alignment can no longer be used, and the $\gamma_{l,m}$ become
$\gamma_{l,m,l',m'}$ because the orthogonality of the spherical harmonics
does not eliminate the cross-terms for a non-spherical
surface. Thus, a very similar matrix equation is found,
which is solvable by the same matrix methods used for the
spherical case or by singular value decomposition using
available or improved precision versions of appropriate
subroutines, such as defined in Press et al., supra. However,
numerical computations may be used to calculate the
corresponding matrix coefficients. For the spherical case above,
closed form expressions may be derived for rapid
computation. When the same matrix coefficients ($\alpha_{l,m}$, $\beta_{l,m,l',m'}$, and
$\gamma_{l,m,l',m'}$) are computed using iterative numerical methods, the
computing requirements increase substantially.

Alternatively, the ligand may be described by using more
centers, each described by a small number of multipoles. In
the extreme, each ligand can be composed of point-charge
locations, and currently 500 would be affordable, i.e.,
computable in under three weeks of computer time, with $l_{max} = 0$
at each center. It is likely that the best solution will be
intermediate, in which there are roughly 10 locations with
$l_{max} = 2$ (monopole, dipolar, and quadrupolar terms) or so at
each center. The distributed centers of low-order multipoles
may be an efficient and accurate way to describe arbitrary
ligand charge distributions. The method has been properly
elaborated for inclusion of interactions between separate
multipole centers, and results using spherical geometries
indicate that using two multipole distributions rather than
one allows an equivalent description of the optimal charge
distribution to be achieved using roughly one-quarter the
number of multipole components and thus essentially
one-quarter the time.

Two alternative schemes may be used for iterative
numerical computation of matrix coefficients. The first
scheme is a modification of a finite-difference
Poisson-Boltzmann (FDPB) solver, such as DELPHI (I. Klapper, R.
Hagstrom, R. Fine, K. Sharp, and B. Honig. Focusing of
electric fields in the active site of Cu-Zn superoxide
dismutase: Effects of ionic strength and amino-acid
modification. Proteins: Struct., Funct., Genet. 1: 47-59 (1986);
M. K. Gilson, K. A. Sharp, and B. H. Honig. Calculating the
electrostatic potential of molecules in solution: Method and
error assessment. J. Comput. Chem. 9: 327-335 (1988)) and
UHBD (B. A. Luty, M. E. Davis, and J. A. McCammon. Solving the
finite-difference non-linear Poisson-Boltzmann equation. J.
Comput. Chem. 13: 1114-1118 (1992); M. Zacharias, B. A.
Luty, M. E. Davis, and J. A. McCammon. Poisson-Boltzmann
analysis of the λ repressor-operator interaction. Biophys. J.
63: 1280-1285 (1992)), and the second scheme is a
modification of boundary-element methods (BEM) (R. J. Zauhar
and R. S. Morgan. The rigorous computation of the
molecular electric potential. J. Comput. Chem. 9: 171-187 (1988);
R. Bharadwaj, A. Windemuth, S. Sridharan, B. Honig, and
A. Nicholls. The fast multipole boundary element method
for molecular electrostatics: An optimal approach for large
systems. J. Comput. Chem. 16: 898-913 (1995)). These
modifications allow point multipoles, as opposed to just
point charges, to be represented.

Thus, a more complex method includes the following. For
each pole component at each center, iterative continuum
calculations are carried out to determine its screened
coulombic interaction with the receptor point charges
(essentially $\alpha_{l,m}$), and its interaction with its own reaction field
and that of each of the other pole components in the system
in the bound (essentially $\beta_{l,m,l',m'}$) and unbound (essentially
$\gamma_{l,m,l',m'}$) states. It is estimated that a ligand represented by 99
pole components (such as 11 centers with $l_{max} = 2$ at each)
will require under three weeks of CPU time. For many
applications, half that number of centers and half the time
could be sufficient. A multipole distribution about a single
center uses many global terms to accurately describe a
complex charge distribution fairly far from the center. By
distributing in space a number of centers for the expansion,
an equally accurate description can be obtained with many
fewer, somewhat more local, terms.
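
The component counts quoted above follow from the fact that an expansion through order l_max has (l_max + 1)^2 components, since each order l contributes 2l + 1 values of m. The short sketch below, with hypothetical function names, reproduces the 99-component and 500-component figures and shows why a dense direct solve grows so quickly with the number of components.

```python
def components_per_center(l_max):
    """Number of (l, m) components for l = 0..l_max: sum of (2l + 1)."""
    return sum(2 * l + 1 for l in range(l_max + 1))


def total_components(n_centers, l_max):
    """Total unknowns when every center carries an expansion through l_max."""
    return n_centers * components_per_center(l_max)


def dense_solve_cost(n_components):
    """Rough storage (matrix entries) and work (LU operations) for N unknowns."""
    return {"matrix_entries": n_components ** 2,
            "lu_operations": n_components ** 3}


print(total_components(11, 2))    # 11 centers, monopole through quadrupole
print(total_components(500, 0))   # 500 point-charge (monopole-only) centers
print(dense_solve_cost(total_components(11, 2)))
```

For a single center the number of unknowns is of order (l_max)^2, so the storage and work estimates above grow as the fourth and sixth powers of l_max, respectively.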
Using Molecular Descriptors to Discover Ligands 

Referring again to FIG. 1, the charge distribution 42 
defined by the above procedure may be used to determine 
which candidate ligands would have a charge distribution 
closest to the optimum. The descriptions of the charge 
distribution and molecular shape can be used to construct 
ligand structures de novo, or they can be used to screen 
compound databases, or they can be used in the design or 
bias of combinatorial libraries. 

In the process of discovering ligands, detailed point- 
charge distributions are fit to the multipole distributions 
determined using the above methods. Next, molecular frag- 
ments and/or molecules are fit to the point-charge distribu- 
tions. Finally, both the point charges and the fragments may 
be used in the design of combinatorial libraries, described 
below. 

A least-squares fitting procedure may be used to define a
point-charge distribution that is a close fit to the multipole
distribution describing the optimum. For example, a regular
cubic lattice of grid points with roughly the spacing used in
FDPB computations may be used. This can be achieved
using the same tri-linear function used in FDPB codes to
carry out the mapping in the opposite direction (arbitrary
point charges mapped to charged lattice points; see
Klapper et al., supra). Whether a set of point charges can provide an
adequate fit to the electrostatic charge distribution
represented by the multipoles can be determined by computing
the loss in free energy of binding due to using the fit
point-charge distribution in place of the multipoles
themselves. A trial using a cubic lattice of grid points with 0.5-Å
spacing indicates that the computed loss in binding energy
due to fitting point charges is less than 0.001 kcal/mol. In
addition, the resulting point charges assigned are reasonable
in magnitude (nearly all are less than 0.10e), making a fit to
molecular fragments plausible. In this embodiment the
multipoles, which are a somewhat non-local description of
the charge distribution, are converted into a local, grid-based
point-charge description so that molecules can be fit.
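
The tri-linear function mentioned above is the standard way FDPB codes place an arbitrary point charge on a cubic lattice. A minimal sketch of that forward mapping is given below; it illustrates only the weighting function, not the least-squares fit of lattice charges to multipoles, and the function name and example values are assumptions.

```python
import math


def spread_charge(position, charge, spacing):
    """Distribute one point charge over the 8 surrounding lattice points.

    Returns a dict mapping integer lattice indices (i, j, k) to the share of
    the charge assigned to that point, using tri-linear weights.
    """
    base = [math.floor(c / spacing) for c in position]
    frac = [c / spacing - b for c, b in zip(position, base)]
    lattice = {}
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((frac[0] if dx else 1.0 - frac[0]) *
                          (frac[1] if dy else 1.0 - frac[1]) *
                          (frac[2] if dz else 1.0 - frac[2]))
                index = (base[0] + dx, base[1] + dy, base[2] + dz)
                lattice[index] = charge * weight
    return lattice


# Hypothetical charge of 0.08 e at (0.6, 0.2, 0.9) Angstrom on a 0.5-Angstrom lattice.
shares = spread_charge((0.6, 0.2, 0.9), 0.08, 0.5)
total = sum(shares.values())
center_x = sum(q * i * 0.5 for (i, _, _), q in shares.items()) / total
print(total, center_x)
```

The weights conserve both the total charge and the position of the center of charge, which is why the same function is a natural choice when relating lattice charges to an off-lattice description.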

The effectiveness of a set of points in fitting charges may be
measured not only by minimizing the loss in binding energy,
but also by how simply molecules or molecular fragments
may be constructed from the point charges. The cubic lattice
is used as described above to fit functional groups and
molecules. A more molecule-based grid may also be used
and may include connectivity for the common valencies (sp,
sp2, and sp3) co-embedded. Additionally, a uniform density
of point charges may be a disadvantage; rather, having a
higher density of point charges near the ligand surface may
provide a more effective fit.

Given a point-charge distribution, it may be fit to a 
molecule or molecular fragment in several ways. For 
example, a database of molecular fragment geometries and 
point-charge distributions (such as a library derived from the 
PARSE parameter set of fragments (D. Sitkoff, K. Sharp,
and B. Honig. Accurate calculation of hydration free
energies using macroscopic solvent models. J. Phys. Chem. 98:
1978-1988 (1994))) may be used to match individual
functional groups to favorable locations on the point-charge
distribution. This matching process could be a very large
scanning search if each fragment needed to be attempted at
all locations and in every orientation in the ligand volume.
The timing may be improved substantially, though, using a
regular cubic lattice for the point-charge distribution. Each
fragment would only need to be scanned over a relatively
small section of lattice to determine sets of lattice point
charges "diagnostic" for it. These diagnoses may be
compiled for all library fragments, for example, in a hash table,
and clusters of point-charge values may be used to query the
hash table and fit fragments to the charge lattice. So long as
the same grid spacing is maintained, the hash table may be
reused for many different targets and optimizations.

After fragments have been placed, the problem of fitting
them together into molecules is similar to the one addressed
by the MCSS algorithm described above, although the
theoretical foundations for choosing fragment locations are
very different in that method and in the present
computer-implemented process. Two solutions developed there may be
adapted for use here. In the HOOK method described above,
a database of small molecules is used as linkers to fit
fragments together, generally trying to introduce rigidity at
the same time. In the dynamic ligand method (DLD)
described above, a sea of carbon atoms is superposed with
the fragments and a simulated annealing procedure is used
in which the occupancy of each carbon can grow and shrink
and in which bond-making and bond-breaking events are
used to coalesce novel carbon linkers. In each method, each
fragment generally is allowed to move somewhat in order to
create relatively unstrained ligands. An accurate penalty
function for movement based on how movement affects the
computed binding energy may be used. The DLD-based
approach may be better because of its flexibility.

Alternatively, molecules may be grown in a sequential
fashion so as to fill the ligand volume and fit the
point-charge distribution. A straightforward scheme involves
placing a single fragment at a location where it fits the
point-charge field and executing a search for other fragments that
can be joined to the first, adjusting their relative orientation
via the connecting torsion. This procedure can be carried out
in a tree-like manner to create large numbers of ligands. An
appropriate figure of merit or distance metric is applied to
determine whether to accept or reject each new fragment. A
potential that includes van der Waals and torsional terms as
well as a fit to the volume and charge distribution of the
defined optimum may be used in this method.

Yet another alternative is the design of "minimalist"
ligands. The multipole distributions of the optima may be fit
with as few point charges as possible. This optimization
process involves finding a relatively small number of point
charges whose computed binding energy is within a few
tenths of a kcal/mol of the optimum. Studies with
complementary nucleotide bases suggest that a better
complementary "base" than that used by nature can be reconstructed
using only one-third the number of point charges, i.e., a
complement to adenine can be constructed using only four
point charges; and this complement binds tighter than
adenine. These reduced point-charge ligands retain the
Watson-Crick hydrogen bonding to the partner, although in
somewhat different fashion. Models for ligands containing
very few required point charges may represent the key
compensating interactions that a more elaborate ligand
should contain. They may serve as useful scaffolds or seeds
upon which further computational molecular design should
be carried out or about which a synthetic combinatorial
strategy could usefully be built. If they bind tightly enough,
they may be particularly useful therapeutics because it may
be very difficult for the virus to evolve resistance to a small
ligand targeted to catalytic side chains.
Designing Combinatorial Libraries

The design of combinatorial libraries, as illustrated at 50
in FIG. 1, will now be described in more detail. Although
there is substantial long-standing interest in using
computational molecular modeling to carry out de novo rational
ligand design, there are other ways in which this method can
be used for ligand discovery. In particular, this method can
be used to define a relatively narrow region of chemical
space, and a combinatorial library can be designed to search
that limited space particularly thoroughly. Given the finite
synthetic capacity of even the most ambitious combinatorial
chemistry schemes, this mechanism can channel synthetic
diversity into higher probability directions.

Again, there are several alternative implementations for
this computational method. One implementation begins with
detailed grids of point charges fit to the optimal multipoles
and segregates the grid into regions of space corresponding
to pockets appropriate for receiving one or more functional
groups. The shape and point charges are then used to assign
the general size and character, e.g., positively charged,
negatively charged, highly polar, moderately polar, weakly
polar, or hydrophobic. These property definitions may be
used to bias combinatorial synthesis towards generating
appropriate ligands.

Having described the computational aspects of the present
computer-implemented process, some biological model
systems will now be described.
Biological Model Systems

EXAMPLE 1
Class II Major Histocompatibility Complex (MHC) Proteins

Introduction

The major histocompatibility complex proteins (MHC)
are cell-surface antigen presenting structures whose role is
to display a sample of proteolyzed intracellular peptides to T
cells. Recognition of a peptide as "foreign" by a T cell
induces an immune response. This response includes killing
the antigen-presenting cell (class I MHC, usually) or secreting
lymphokines that control attack by various elements of the
immune system, including B cell activation (class II MHC,
usually). Because each individual has a limited number of
histocompatibility proteins and a virtually unlimited number
of peptides to present, each MHC molecule is capable of
presenting a wide variety of peptides. Structural studies have
revealed separate mechanisms used by class I and class II
MHC molecules for achieving high affinity yet fairly low
specificity (L. J. Stern and D. C. Wiley, Structure 2:245
(1994)).

The structure of the HLA-DR1 class II MHC protein
complexed with a peptide from influenza virus has been
solved (L. J. Stern, et al., Nature (London) 368:215 (1994)).
The original structure of HLA-DR1 in complex with
influenza hemagglutinin residues 306-318
(PKYVKQNTLKLAT) elucidated a number of important
features of binding and recognition that have been confirmed
in other class II MHC complexes. The protein is comprised
of an eight-stranded beta-sheet with two
immunoglobulin-like domains on the cell-surface side and a pair of
alpha-helices on the extracellular side. The peptide-binding site is
a cleft between the two helices and supported by the
beta-sheet. The peptide binds in an extended but highly 
twisted conformation, similar to the type II polyproline 
helical conformation; the N- and C-termini extend outside of 
the site. Most of the hydrogen bonds between peptide and 
protein (12 of 15) are to peptide backbone groups, which 
helps to explain how the protein recognizes many different 
peptides. The observed peptide conformation forces each 
peptide side chain into one of three directions: 5 of the side 
chains (Y308, Q311, T313, L314, and L316) are directed 
into pockets in the surface of the MHC molecule and are 
essentially buried, 6 of the side chains (K307, V309, K310, 
N312, K315, and T318) are directed out away from the
binding site and toward the T cell receptor, and the remain- 
ing 2 side chains (P306 and A317) are directed across the 
site. Thus, residues making extensive contact with the MHC 
are, for the most part, distinct from those poised to interact 
with approaching T cell receptors. Of the 5 pockets, the 
deepest accommodates Tyr308, though binding studies have
shown that tyrosine, phenylalanine, or tryptophan are all
allowed. Different class II MHC alleles incorporate substi- 
tutions at the 5 pockets that receive the 5 buried side chains. 
It is thought that alterations in these interactions are respon- 
sible for allotypic differences in peptide specificity. Because 
individuals differ in their allotypic complement of MHC 
molecules, individuals differ in the profile of their immune 
response. 

The relative affinity of peptides for binding to individual 
class II MHC molecules is thought to be responsible for 
relative peptide antigenicity. Phage display selection and 
amplification studies have defined the frequency with which 
each amino acid is found at individual positions in high- 
affinity peptides (J. Hammer, et al., J. Exp. Med. 176:1007 
(1992) and J. Hammer, et al., PNAS U.S.A. 91: 4456 
(1994)). The strongest anchor position was determined to be 
a large aromatic at position 1, which was found as Tyr 
(48%), Phe (25%), or Trp (13%) predominantly. Position 4 
was a long hydrophobe, found as Met (50%) and Leu (28%); 
position 6 was a small residue, found as Ala (32%) and Gly 
(22%); and position 9 was generally found as Leu (45%) (J.
Hammer, et al., J. Exp. Med. 176:1007 (1992)). Also, there
were very few negatively charged side chains recovered at
any position. These data provide a semi-quantitative
set of relative affinities that are useful for validating the
computational methodology of the present process.
Testing and Validation 

The class II MHC HLA-DR1 system is used to test the 
computational methodology disclosed herein to analyze the 
peptide-binding site, and to design enhanced-binding mol- 
ecules. Testing and validation consists of a number of tasks, 
initially using the crystal structure with bound viral peptide 
(L. J. Stern, et al., Nature (London) 368:215 (1994)). These
tests are designed to confirm that the methods are capable of 
(i) recognizing that the observed bound peptide is a good 
binder, (ii) recognizing that known deleterious peptide muta- 
tions are unfavorable, (iii) recognizing that known 
enhanced-binding peptide mutations are favorable, (iv) 
reproducing the known pattern of binding hydrophobic, 
polar, and charged residues in individual surface pockets, 
and (v) regenerating the known peptide backbone confor- 
mation and contacts. 
Analysis of Bound-Peptide Complex 

The binding of viral peptide to HLA-DR1 is analyzed 
using the methods disclosed herein. Briefly, the strategies 
disclosed herein are used to regenerate the bound peptide's 
charge distribution. A variable point charge is placed on each
atom of the peptide and the charge values are computed that
optimize the free energy of binding. By comparing these
point charges to the actual point charges, the reduction in
affinity of the peptide compared to the calculated optimum
can be computed. Discrepancies between the calculated
point charges and the actual point charges of the viral
peptide suggest that the possibility exists to design an
enhanced-binding ligand. In this manner, this set of tests is
used to confirm the asserted utility of the claimed methods
with respect to designing an enhanced-binding ligand based
upon the structure of a known ligand and its binding partner.
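
The comparison of actual and optimized atomic charges can be sketched as follows. This is an illustration only: it assumes the binding energy is quadratic in the ligand charges, so that the loss relative to the optimum is (q - q_opt).B.(q - q_opt), and the two-atom charges and the B matrix are hypothetical numbers rather than values computed for HLA-DR1.

```python
def binding_energy_loss(q_actual, q_opt, b):
    """Loss in binding energy of q_actual relative to the optimum q_opt.

    Assumes a quadratic energy model with symmetric matrix b, for which the
    loss is the quadratic form of the charge discrepancy.
    """
    d = [a - o for a, o in zip(q_actual, q_opt)]
    n = len(d)
    return sum(d[i] * b[i][j] * d[j] for i in range(n) for j in range(n))


def rank_discrepancies(names, q_actual, q_opt):
    """Atoms sorted by how far their actual charge lies from the optimum."""
    pairs = [(abs(a - o), name) for name, a, o in zip(names, q_actual, q_opt)]
    return [name for _, name in sorted(pairs, reverse=True)]


# Hypothetical two-atom example.
atom_names = ["atom_1", "atom_2"]
q_optimal = [0.3, -0.1]
q_observed = [0.4, -0.3]
b_example = [[2.0, 0.5],
             [0.5, 1.0]]
loss = binding_energy_loss(q_observed, q_optimal, b_example)
print(loss, rank_discrepancies(atom_names, q_observed, q_optimal))
```

Atoms at the head of the ranked list are the ones whose charges differ most from the computed optimum and are therefore the first candidates for modification.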
Enhanced- and Reduced-Binding Mutations 

Tests of relative affinity are performed initially using
isosteric or near-isosteric replacements. From the data of
Hammer et al. using phage display studies, Tyr is preferred
over Phe at position 1, and Met or Leu is preferred over Gln
at position 4, where Tyr and Gln are the residues found at
those positions in the bound peptide structure (L. J. Stern, et al., Nature
(London) 368:215 (1994); J. Hammer, et al., J. Exp. Med.
176:1007 (1992); and J. Hammer, et al., PNAS U.S.A.
91:4456 (1994)). The methods disclosed herein are used to
compute the change in affinity due to these mutations.

The novel strategy disclosed herein for ligand design is to
start with a given conformation of receptor (or other target
molecule, such as an antibody or an enzyme) and find the
properties of the ligand that will optimally complement that
conformation. The tests described herein assay
whether the methods can detect differences in affinity due to
differences in the ligand charge distribution, an essential
prerequisite for defining the optimal ligand charge
distribution. When the point-charge magnitudes are optimized as
described above, it is expected that the
polarity assigned to the Tyr1 hydroxyl remains, that of Val2
increases, and that of Gln4 and Thr6 decreases, reflecting the
electrostatic tendencies observed by Hammer et al. (J. Hammer, et al.,
J. Exp. Med. 176:1007 (1992); and J. Hammer, et al., PNAS
U.S.A. 91:4456 (1994)).
Pattern of Polar and Non-Polar Side Chains 

The methods of the present computer-implemented
process are used to probe the peptide binding site without
reference to known positions of peptide atoms. This probing
is done in two modes. In one mode, each of the five major
binding pockets is probed through individual optimization
of the charge distribution in that site only; in the second, the
five major binding pockets are probed together, with the
charge distribution for the entire site optimized in one
computation. A comparison of the results indicates the
extent to which the sites are coupled; experimental work
suggests that the coupling should be minimal (J. Hammer, et
al., J. Exp. Med. 180:2353 (1994)). A complementary
shaped region is constructed through sphere packing and the
multipolar charge distribution that optimizes binding to the
site is computed. Both through direct examination of the
multipoles and by constructing a gridded point-charge
distribution complementary to the site, each site is categorized
as to how well it accepts hydrophobic, polar, positive, and
negative groups. Examination may reveal mixed character,
where a site is largely hydrophobic but accepting of some
localized polar groups (presumably the Tyr1 site is of this
type). Comparison with the known site characteristics is
done to evaluate the results. A discrepancy may result if the
peptide desolvation penalty used in the computation (that
which would result from a rigid peptide in the bound
conformation) were substantially different from that
experienced by actual ligands in phage-display studies. However,
we do not anticipate this discrepancy to be of concern
because the desolvation penalty is dominated by polar and
charged groups, which should be exposed to solvent in the
unbound state and which should be in the observed extended
and twisted conformation.
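
The comparison between the two probing modes can be illustrated with a toy model. The sketch assumes a quadratic energy alpha.q + q.B.q with one scalar charge variable per pocket and only two pockets; the off-diagonal element c of B plays the role of the inter-pocket coupling. All names and numbers are hypothetical.

```python
def optimum_single(alpha, b):
    """Optimal energy of one pocket treated alone: -alpha^2 / (4 b)."""
    if b <= 0.0:
        raise ValueError("b must be positive for a minimum to exist")
    return -alpha * alpha / (4.0 * b)


def optimum_pair(a1, a2, b11, b22, c):
    """Optimal energy of two pockets optimized together.

    Uses -(1/4) alpha.B^{-1}.alpha with B = [[b11, c], [c, b22]].
    """
    det = b11 * b22 - c * c
    if b11 <= 0.0 or det <= 0.0:
        raise ValueError("B must be positive definite")
    quad = (b22 * a1 * a1 - 2.0 * c * a1 * a2 + b11 * a2 * a2) / det
    return -0.25 * quad


def coupling_energy(a1, a2, b11, b22, c):
    """Joint optimum minus the sum of the individually optimized pockets."""
    separate = optimum_single(a1, b11) + optimum_single(a2, b22)
    return optimum_pair(a1, a2, b11, b22, c) - separate


print(coupling_energy(-2.0, -1.0, 1.5, 1.0, 0.0))   # uncoupled pockets
print(coupling_energy(-2.0, -1.0, 1.5, 1.0, 0.3))   # weakly coupled pockets
```

A coupling energy near zero would indicate that the pockets can be optimized independently, which is the behavior the experimental work cited above leads one to expect.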
Backbone Trace and Contacts

Because the backbone trace is thought to be invariant for
essentially all peptides that bind, one expects the site to
strongly dictate backbone contacts. Accordingly, the
methodology of the present computer-implemented process is
used to regenerate the position of the backbone observed
crystallographically to further validate the methods
disclosed herein. Using the above-described methods, the
distributed multipole description of the optimal ligand in the
region of peptide backbone binding is identified, converted
to a gridded point-charge field, and the peptide amide groups
(N-methyl acetamide) are fitted into the charge field by a
least-squares fit while also not allowing steric overlap with
the walls of the site.
Design for Enhanced Binding 

In general terms, the strategy for designing
enhanced-binding ligands is to locate opportunities where known
ligands do not take full advantage of the site. To this end,
both individual chemical groups that pay more in
desolvation energy than they recover in favorable interactions and
sites where current liganding groups fall short of
computed optima are identified. The computations carried
out above (Testing and Validation) are re-analyzed in search
of such opportunities.
Analysis of Bound-Peptide Complex 

The complete electrostatic dissection described above is
used to detect functional groups (side chains or backbone
dipolar groups on both the peptide and the binding site)
whose total electrostatic contribution to binding is
unfavorable (that is, whose mutation to a hydrophobic group is
computed to lead to tighter binding). This electrostatic
dissection suggests targeting regions of the peptide (even if
they are backbone) for modification to hydrophobic groups
to produce a more stable complex. Using this strategy we
were able to identify three stabilizing mutations in a variant
of the Arc repressor (Z. S. Hendsch, et al., Biochemistry
35:7621 (1996)). An MHC protein group that contributes
unfavorably to binding can be ameliorated by modifying the
peptide to make improved interactions with it. These
opportunities can be confirmed by a number of parallel studies,
including the computation in which the point charges of the
viral peptide atoms are re-optimized (see above). The same
locations for reduction and increase in the polarity of the
ligand should be found. Such parallel confirmation is used to
provide further evidence that a proposed site can be
modified to enhance binding.
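
The bookkeeping behind such a dissection is simple and can be sketched as follows. The record type, the group names, and the energy values are hypothetical; the only rule illustrated is the one stated above, namely that a group is flagged when its desolvation cost exceeds what it recovers in favorable interactions.

```python
from typing import List, NamedTuple


class GroupContribution(NamedTuple):
    name: str
    desolvation: float   # kcal/mol, cost paid on binding (positive)
    interaction: float   # kcal/mol, interactions formed on binding (negative if favorable)


def total_contribution(group: GroupContribution) -> float:
    """Net electrostatic contribution of one group to binding."""
    return group.desolvation + group.interaction


def unfavorable_groups(groups: List[GroupContribution]) -> List[str]:
    """Names of groups whose net electrostatic contribution opposes binding."""
    return [g.name for g in groups if total_contribution(g) > 0.0]


# Hypothetical dissection of three groups.
dissection = [
    GroupContribution("side_chain_A", desolvation=2.1, interaction=-3.0),
    GroupContribution("backbone_B", desolvation=1.8, interaction=-0.9),
    GroupContribution("side_chain_C", desolvation=0.4, interaction=-0.4),
]
print(unfavorable_groups(dissection))
```

Groups returned by the filter are the candidates for replacement with hydrophobic groups or for improved partner interactions.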
Pattern of Polar and Non-Polar Side Chains 

It is anticipated that the individual pocket optimizations as
well as that of the whole site can be used as the source of
suggested detailed changes for the purpose of identifying
ligands with enhanced binding to their target molecule. In
choosing locations for such optimization, regions where the
largest free energy gains can be recovered, as measured by
the discrepancy between actual and optimized charge
distribution and the corresponding binding energy, are initially selected.
Three such regions include: the position 1 binding pocket,
the peptide-backbone binding area (see below), and a pocket
occupied by a solvent cluster in the viral-peptide study (L.
J. Stern, et al., Nature (London) 368:215 (1994)). Position 1
accommodates a Tyr in the viral-peptide complex (L. J.
Stern, et al., Nature (London) 368:215 (1994)) but is
frequently found as Phe or Trp as well (J. Hammer, et al., J.
Exp. Med. 176:1007 (1992); and J. Hammer, et al., PNAS
U.S.A. 91:4456 (1994)). In a recently determined crystal
structure of HLA-DR1 with a different peptide bound, Trp
occupies the site. The orientation of the Trp side chain is
roughly 90° rotated with respect to the Tyr, yet the
surrounding protein pocket is essentially unchanged. Placing a large
hydrophobic side chain in this pocket appears virtually a
requirement for binding (J. Hammer, et al., PNAS U.S.A.
91:4456 (1994)). The optimized charge distribution
generated for groups binding to this pocket can be used as a guide
to a combinatorial synthetic scheme to synthesize
enhanced-binding ligands.
Backbone Trace and Contacts 

The present computer-implemented process can be used 
to design ligands having non-peptide backbones for 
improved binding. By comparing the optimized charge 
distributions for the backbone-binding region to the peptide 
charge distribution, improved backbone chemistries can be 
rationally designed. For example, the method can be used to
identify ligands that have the equivalent of an alpha-carbon
(or at least a beta-carbon) so as to permit attachment of the
presented side chains onto the T-cell side of any new
platform that is designed.



EXAMPLE 2 
HIV Protease

Introduction

The protease from HIV is required for proper assembly of
the virus. Inactivation of the protease by mutation leads to the
production of non-infectious particles. Design of HIV-protease
inhibitors has been a major effort of a number of
pharmaceutical companies for the past decade or more. This
research was aided by the facility with which high-resolution
X-ray crystallographic data could be obtained
after proper conditions were worked out for expression,
purification, and crystallization. In the Protein Data Bank
there are currently over 45 structures of HIV-1 protease
either alone or in complex with ligand. These structures
provide a rich data set for examining modes of interaction of
different ligands with a common protein. A number of very
promising inhibitors have already been developed; some are
in clinical trials, and a few have been approved by the FDA.
Nevertheless, "escape" mutants of the protease have been
isolated for a number of these inhibitors.

The protease structure reveals an essentially symmetric 
homodimer of a 99-residue polypeptide chain. The active 
site is formed at the two-fold axis, is enclosed by a pair of 
symmetry-related loops that appear highly flexible in the 
unbound state but close over the active site upon ligand 
binding, and adjoins a cleft that can bind substrates up to 
seven residues long. The active site contains the triad Asp25, 
Thr26, and Gly27 from each subunit, with the pair of Asp25 
carboxylate groups in close proximity and nearly coplanar. 
The apparent exact two-fold symmetry of the enzyme in the 
absence of ligand is disrupted somewhat by binding 
(asymmetric) peptide ligands. One particularly interesting 
issue in design studies has been whether asymmetric ligands 
(such as those modeled on peptide substrates) or symmetric 
ligands (which have the opportunity to bind with the ligand 
two-fold axis coincident with the enzyme two-fold) are 
tighter binding. A surprising result has been that certain 
symmetric ligands are found to bind asymmetrically in the 
active site. The computational methods disclosed herein can 
be used to investigate the energetic contributions to this 
difference. 

Substrate specificity studies have been used to determine
binding preferences for peptides. These have revealed affinity
for Gln or Glu at the P2' position and a largely hydrophobic
side chain (Phe, Leu, Met, Asn, or Tyr) at P1. Less
pronounced preferences include Glu at P3 and a hydrophobe
at P2 (A. Wlodawer and J. W. Erickson, Annu. Rev. Biochem.
62:543 (1993)).

The application of the methods disclosed herein to the
analysis of binding modes of molecules whose involvement
is critical to HIV infection can be used to facilitate the design
of tight-binding ligand molecules for use as diagnostic and
therapeutic agents. In general, the methods of the present
computer-implemented process are primarily continuum
electrostatics and secondarily free energy simulations. The
process provides a novel method for finding the electrostatic
complement of a target molecule. The preliminary results
demonstrate that the molecular and energetic dissection
provided by a continuum analysis yields conclusions that are
consistent with those found using a more detailed (and
time-consuming) free energy simulation, in a pair of studies
on protein-DNA recognition by 434 repressor (see Example 3).
Testing 

Initial testing of the fundamentals of the method is
carried out in studies of the class II MHC molecule, and next
using the HIV-1 protease. Accordingly, the
above-described methods are used to design enhanced-binding
ligands that bind to the HIV-1 protease. One difficulty
encountered with many ligand design protocols is the
need to predict the conformation of bound complexes. The
present computer-implemented process circumvents this
problem by choosing a conformation of the protein and
solving for a set of molecular descriptors for an optimally
complementary ligand. The process also provides tools to
examine a subset of the available structures of HIV-1
protease, both alone and in complex with various ligands.

Loop Conformation 

Two symmetry-related loops are in an open conformation
in the unbound form of the enzyme and close down against
the active site in the bound form. One set of inhibitors that
is well characterized and is attractive due to its relative
rigidity is the cyclic urea compounds being developed by
DuPont Merck Pharmaceuticals (P. Y. S. Lam, et al., Science
263:380 (1994) and C. N. Hodge, et al., Chem. & Biol. 3:301
(1996)). Members of this family of compounds can be used
to analyze the bound-state structure. For example, the complex
with XK 263 (a symmetric cyclic urea with two
naphthyl, two phenyl, and two hydroxyl substituents) is in
the Protein Data Bank, and the complexes with DMP 323
and DMP 450 are shown in FIG. 4 ("Inhibitors of HIV-1
Protease"). The effect of this conformational change on the
computed properties of the optimal ligand is examined
using the above-described methods. The effect of receptor
conformational change on complementary ligand properties
is examined by characterizing the optima by the shapes and
relative polarities of the moieties occupying individual subsite
pockets at the active site. It is anticipated that the
differences in computations will be rather small, since the
substrate must initiate binding with the loops in the open
conformation and complete binding when the loops are
closed. Substrates either represent some compromise
between being complementary to the open and closed forms,
or there is no substantial difference between the two.
Protonation State 

One important and currently unresolved question central
to the design of protease inhibitors is the protonation state of
the catalytic aspartyl residues (Asp 25 and 25'). It is anticipated
that the protonation state of this pair of side chains will
substantially change the properties of the computed electrostatic
complement. It is potentially worthwhile for a ligand
to incur a greater desolvation penalty to interact with a
charged, rather than an uncharged, aspartic acid. It is anticipated
that a comparison of the computed optimal ligand
electrostatic properties to actual bound ligands will permit
the assignment of protonation states to some of these complexes.
The case of the cyclic ureas from the DuPont Merck
group is useful in this study because NMR evidence is
consistent with the aspartyl groups each being protonated
(D. A. Torchia, et al., J. Am. Chem. Soc. 116:1149 (1989)). By
comparing the optimal complements from computations
using a doubly-, singly-, and unprotonated catalytic pair, the
effect of the availability of chemical freedom in an active
site on its ligand binding properties is determined. Such
studies also permit the identification of a preferred titration
state that is more susceptible to ligand binding than others.
Symmetric and Asymmetric Binding 

A number of symmetric inhibitors have been designed
based on the principle that they would be more complementary
to the symmetric enzyme (M. Miller, et al., Science
246:1149 (1989)). Although some of these have been
observed to bind symmetrically in crystallographic studies
(XK 263 (P. Y. S. Lam, et al., Science 263:380 (1994)),
DMP 450 (C. N. Hodge, et al., Chem. & Biol. 3:301 (1996)),
and A-76928 (M. V. Hosur, et al., J. Am. Chem. Soc. 116:847
(1994))), others bind asymmetrically (A-76889 (M. V. Hosur,
et al., J. Am. Chem. Soc. 116:847 (1994))). There could be
two reasons for asymmetric binding of a symmetric inhibitor.
Either the site can deform so that it is truly complementary
to an asymmetric ligand, or the site can remain essentially
symmetric but the ligand preferentially makes
asymmetric interactions. These cases can be distinguished
more precisely by examining, for two-fold symmetry, the
computed electrostatic complement for sites harboring
symmetrically and asymmetrically bound ligands. If the
complement remains symmetric for asymmetrically bound
ligands, improvements to the ligand can be defined using the
above-described methods. For example, enhanced-binding
ligands can be designed by studying the four compounds in
FIG. 4, which represent different choices (both symmetric
and asymmetric) for using hydroxyl groups to compensate
the buried catalytic aspartic acid side chains.
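One way to examine a computed complement for two-fold symmetry is to compare each charge with the charge at its symmetry-related position. The sketch below is a minimal illustration under stated assumptions: the complement is represented as point charges (the process described here works with multipoles), the two-fold axis is taken to be the z-axis, and the function names and sample coordinates are hypothetical.

```python
import math


def c2_rotate(point):
    """Rotate a point by 180 degrees about the z-axis (assumed two-fold axis)."""
    x, y, z = point
    return (-x, -y, z)


def asymmetry(sites):
    """RMS difference between each charge and the charge at its C2-related site.

    sites: list of ((x, y, z), charge). The partner of a site is the site
    nearest to its rotated position. A value of zero means the charge set
    is two-fold symmetric.
    """
    total = 0.0
    for position, charge in sites:
        image = c2_rotate(position)
        partner_charge = min(sites, key=lambda s: math.dist(s[0], image))[1]
        total += (charge - partner_charge) ** 2
    return math.sqrt(total / len(sites))


# Hypothetical charge sets used only to exercise the function.
symmetric_sites = [((1, 0, 0), -0.5), ((-1, 0, 0), -0.5),
                   ((0, 2, 1), 0.3), ((0, -2, 1), 0.3)]
asymmetric_sites = [((1, 0, 0), -0.5), ((-1, 0, 0), -0.1)]

print(asymmetry(symmetric_sites))
print(asymmetry(asymmetric_sites))
```

A near-zero score for a site that holds an asymmetrically bound ligand would point to the second explanation in the text: the site stays symmetric and the ligand makes asymmetric interactions.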
Design 

The approaches to design of protease inhibitors are simi- 
lar to those described above in reference to the design of 
MHC ligands. A few design points unique to HIV protease 
are described herein. 

Each of the above-described studies answers specific
questions about how conformational and titration changes to
active sites affect the properties of the computed complementary
ligand. Each study also can be analyzed for opportunities
to modify existing ligands (to obtain "enhanced-binding
ligands") or to design entirely new ligands with
enhanced affinity. Alcohols and diols are prevalent in a
number of HIV-1 protease inhibitors; more effective moieties
to satisfy the electrostatic properties of the aspartyl
groups can be identified to design enhanced-binding ligands.

One particularly important problem with all drugs targeted
to HIV is the eventual evolution of "escape" mutants.
The invention is useful for developing the minimal charge
configuration required to complement the active site residues.
It is believed that such a core molecule is useful
because its limited size should reduce the number of contacts
potentially disruptable by escape mutants. In addition,
since contacts would all be at the catalytic site, disrupting
mutants could be inactive.

Design Studies on Other HIV Targets

Design studies against other HIV targets are performed
using the above-described methods. Other HIV targets
include the RNA complexes of TAR and RRE and the HIV
envelope proteins.

EXAMPLE 3
The 434 Repressor DNA-binding Domain

Introduction

We have analyzed the high-resolution X-ray crystal structure
of the 434 repressor DNA-binding domain, R1-69,
bound to the O$_R$1 operator using continuum electrostatic
calculations. The principal results are outlined below. The
interaction was dissected into contributions from each protein
backbone carbonyl, C$_\alpha$NH, and side chain and each
nucleic-acid ribose, base, and phosphate. For each group a
desolvation contribution to binding was calculated as well as
contributions from new interactions made across the interface
(termed "intermolecular") and from changes in the
screening within the protein or DNA (termed "intramolecular"
interactions).

Currently only the rigid binding of pre-conformed protein
dimer to pre-conformed DNA has been studied. These methods
can be extended to address conformational flexibility as
described above. The overall electrostatic contribution to
binding is unfavorable (45.3 kcal/mol). This is due to a large
desolvation penalty (132.9 kcal/mol) that is only partially
offset by favorable intermolecular terms (-96.4 kcal/mol).
The sum of intramolecular terms is small and unfavorable
(8.8 kcal/mol). Four salt bridges formed in the complex (two
symmetry-related pairs) stabilize complex formation by an
average of -1.7 kcal/mol each. This is due largely to the fact
that these groups incur a smaller desolvation penalty than do
protein side chains in folding from the unfolded state. In this
regard, binding appears to be somewhat different from
folding, but our further results show that the distinction is
somewhat more complex.

The largest contributors to the desolvation penalty come
from the charged groups in the system: protein side chains
(63.5 kcal/mol) and DNA backbone groups (50.6 kcal/mol).
Protein backbone groups (6.9 kcal/mol) and DNA bases
(11.9 kcal/mol) incur much smaller costs. The desolvation
penalty is substantial for many groups that become buried at
the protein-DNA interface. Interestingly, some side chains
that lie nearby but not at the interface also lose significant
solvation on binding.

The strong, favorable intermolecular interactions formed
in the complex are made almost entirely with DNA backbone
groups. Surprisingly, equal amounts come from interactions
with protein side chains (-42.2 kcal/mol), which
include a large number of charged groups, as with the
protein backbone (-41.8 kcal/mol), which is only polar
except for the charged termini. The analysis points to strong
interactions made between the N-termini of alpha-helices
and individual phosphate groups as well as the strong
interaction between the backbone groups in a protein loop
with a set of phosphates. Additionally, some interactions
between charged side chains and the DNA backbone are
rather distant, but are directed through the low-dielectric
protein, where electrostatic interactions might be expected
to be longer range due to less screening by solvent.

Intermolecular interactions with the bases are small: 2.2
kcal/mol (unfavorable) with protein backbone and -14.6
kcal/mol with protein side chains, although the
base-side chain interactions are generally thought to confer
substantial specificity to protein-DNA complexes.
Interestingly, interactions close enough to make a hydrogen
bond account for roughly half the favorable intermolecular
interactions; an equal contribution comes from interactions
too distant to be hydrogen bonding. In particular, many of
these more distant interactions are to the "non-contacted"
bases in the central region of the operator; Arg43 in the left
and right half-sites contribute -3.9 kcal/mol.

Overall the intramolecular interactions contribute only 8.8
kcal/mol to the electrostatic docking energy, but the sources
of this effect are quite interesting. These interactions exist in
an identical geometry in the bound and free states since they
are within the protein or DNA. Their magnitude changes on
docking, however, because the removal of high-dielectric
solvent in the complex reduces the screening of interactions.
Repulsions within the DNA backbone (due largely to
phosphate-phosphate interactions) increase in magnitude by
19.2 kcal/mol on binding protein because the reduced solvent
screening in the bound state leads to a lower effective
dielectric. This is partially offset by a favorable contribution
between protein side chains of -11.9 kcal/mol, which is due
largely to attractive salt bridges within the protein whose
strength "increases" due to reduced screening in the complex.

When all of the contributions (desolvation,
intermolecular, and intramolecular) are tabulated for each
group, most groups individually pay more in desolvation
energy than they recover in other interactions. This is
particularly true for the phosphate groups and all but one
base, as well as for the side chains at the binding interface.
Groups that do recover more than they pay in desolvation
energy tend to be largely buried in the undocked state.

In summary, this work demonstrates the detailed insights
that result from an energetic dissection of a binding event.
These techniques are useful for exploring ligand binding to
HIV targets, and permit the rational design of enhanced-binding
ligands.

Free Energy Analysis of the Effect of a Point Mutation:
Simulation of a Base-Pair Change in a 434 Repressor-DNA
Complex

To address specific issues of recognition and to validate
the results of our continuum electrostatic investigation, we
carried out a free energy simulation study with explicit
solvent. The bound-state starting structure was the high-resolution
complex of R1-69 bound to O$_R$2 (L. J. W. Shimon
and S. C. Harrison, J. Mol. Biol. 232:826-838 (1993)). The
mutation was TA→GC at position 7L. Multiple unbound
conformations of the DNA were generated from a 300-ps
molecular dynamics trajectory. Five frames of the trajectory
were chosen and used as starting structures for the unbound-state
free energy calculation. Although most of the interactions
between 434 repressor and DNA are in the major
groove, this operator mutation occurs near the pseudo-dyad
axis where repressor faces the minor-groove side of DNA. A
total of ten simulations were carried out in the unbound state
and six in the bound state, which led to good statistics. The
results are outlined briefly here (E. J. Simon, "A Molecular
Dynamics Study of a Mutation in a Bacteriophage 434
Operator/Repressor Complex", PhD thesis, Harvard University
(1996)).

The overall stability change is +1.4±0.7 kcal/mol, which
disfavors binding to the mutant operator. This is in good
agreement with experimental values of 0.8-1.2 kcal/mol (G.
B. Koudelka, et al., Nature (London) 326:886 (1987)). An
analysis of the source of this overall stability change (B.
Tidor, "Molecular Modeling of Contributions to Free Energy
Changes: Applications to Proteins", PhD thesis, Harvard
University (1990); B. Tidor and M. Karplus, Biochemistry
30:3217 (1991); and B. Tidor, Proteins: Struct., Funct., and
Genet. 19:310 (1994)) was carried out and shows a strong
repulsion between the side chain of Arg43L and the N2
amino group of the mutant guanine. This is a remarkably
interesting interaction because it suggests that this arginine
acts as a negative determinant of specificity by "interfering"
with a guanine at this position. The array of hydrogen-bond
acceptors in the minor groove in this AT-rich region of the
mutation site and the flanking phosphate groups polarize the
surrounding solvent water to interact favorably with this
negative potential. The introduction of the Gua N2 donor to
the minor groove effectively repels this polarized solvent.
The repulsion is stronger in the unbound than the bound state
because solvent is displaced from this region of the minor
groove on protein binding.

Comparison of these free energy simulation results with
the continuum electrostatic study shows essentially the same
dissection for the interactions of Arg 43, including the
solvent polarization effect. This comparison demonstrates
the accuracy of continuum methodology relative to explicit
simulations. The present computer-implemented process is
based around the continuum approach, which is more economical
and can be used to analyze an entire binding site at
once, rather than one group at a time. Free energy simulations
are used primarily to examine points of disagreement
between continuum theory and experiment.

APPENDIX

FIG. 5 illustrates problem geometries. FIG. 5a shows the
binding reaction between a receptor (R) and spherical ligand
(L) that dock rigidly to form a spherical bound-state complex.
Receptor, ligand, and complex are all low-dielectric
media ($\epsilon_1$) that are surrounded by high-dielectric solvent
($\epsilon_2$). FIG. 5b shows that the boundary-value problem
solved here involves a charge distribution in a spherical
region of radius $R$ with dielectric constant $\epsilon_1$ surrounded by
solvent with dielectric constant $\epsilon_2$. The origin of coordinates
is the center of the larger spherical region, but the
charge distribution is expanded in multipoles about a point
a distance $d$ along the z-axis. The geometric requirement is
that the ligand sphere not extend beyond the receptor sphere,
$R \geq d + a$, although the case of equality is illustrated in the
figure.

The electrostatic free energy of binding is the difference
between the electrostatic free energy in the bound and the
unbound state, $\Delta G_{binding} = G^{bound} - G^{unbound}$ (see FIG. 5a).
Because the dielectric model includes responses that affect
the entropy as well as the enthalpy, the electrostatic energy
is considered to be a free energy. The free energy of each
state is expressed as a sum of coulombic and reaction-field
(hydration) terms involving the ligand (L), the receptor (R),
and their interaction (L-R):

$$G^{state} = G^{state}_{coul,L} + G^{state}_{coul,R} + G^{state}_{coul,L-R} + G^{state}_{hyd,L} + G^{state}_{hyd,R} + G^{state}_{hyd,L-R}. \tag{1}$$

This results in the following expression for the binding free
energy,

$$\Delta G_{binding} = \Delta G_{coul,L-R} + \Delta G_{hyd,L-R} + \Delta G_{hyd,L} + \Delta G_{hyd,R} \tag{2}$$

where the fact that the geometry of point charges in the
ligand and receptor is fixed in the model has been used to
cancel the coulombic self contribution of ligand and receptor
and where the two L-R terms are due only to the bound state
because the ligand and receptor are assumed not to interact
in the unbound state. (Note, however, that the charge distribution
for the receptor need not be the same in the bound
and unbound states. If they are different, this adds a constant
to $\Delta G_{binding}$ that can be dropped in defining $\Delta G_{var}$ in Eq.
(3).) Eq. (2) describes the electrostatic binding free
energy as a sum of desolvation contributions of the ligand
and the receptor (which are unfavorable) and solvent-screened
electrostatic interaction in the bound state (which
is favorable). Since the goal is to vary the ligand
charge distribution to optimize the electrostatic binding free
energy and the last term simply adds a constant, a relevant
variational binding energy is defined,

$$\Delta G_{var} = \Delta G_{int,L-R} + \Delta G_{hyd,L} \tag{3}$$

in which the first two terms on the right hand side (RHS) of
Eq. (2) have been combined into a screened interaction term
and the constant term has been dropped. Note that

$$\Delta G_{int,L-R} = \sum_{j \in R} q_j V^{bound}_{L}(\vec{r}_j) = \sum_{j \in R} q_j \left[ V^{bound}_{coul,L}(\vec{r}_j) + V^{bound}_{hyd,L}(\vec{r}_j) \right] \tag{4}$$

and

$$\Delta G_{hyd,L} = \frac{1}{2} \sum_{i \in L} q_i V^{bound}_{hyd,L}(\vec{r}_i) - \frac{1}{2} \sum_{i \in L} q_i V^{unbound}_{hyd,L}(\vec{r}_i) \tag{5}$$

where $V^{state}_{L}$ is the total electrostatic potential in the indicated
state due to the ligand charge distribution only and
$V^{state}_{term,L}$ is the coulombic or reaction-field (hydration)
term, as indicated. The summations are over atomic point
charges in the ligand ($i \in L$) or receptor ($j \in R$). The factor of
1/2 in Eq. (5) is due to the fact that the ligand charge
distribution interacts with the self-induced reaction field.
$V^{bound}_{coul,L}$, $V^{bound}_{hyd,L}$, and $V^{unbound}_{hyd,L}$, the three electrostatic
potentials in Eqs. (4) and (5), are expressed in terms
of the given geometry and charge distribution by solving the
boundary-value problem shown in FIG. 5b. A charge distribution
(corresponding to the ligand) is embedded in a sphere
of radius $R$. The center of the sphere is taken as the origin
of coordinates (unprimed) but the charge distribution in
multipoles is expanded about a second origin (primed)
translated a distance $d$ along the z-axis, so that

$$\vec{r} = \vec{r}\,' + \vec{d}. \tag{6}$$
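The bookkeeping behind the variational binding energy can be sketched in a few lines. The sketch assumes, following the description in this appendix, that the screened interaction term is a sum over receptor charges of each charge times the bound-state ligand potential at its position, and that the ligand desolvation term is one half the sum over ligand charges of each charge times the difference between the bound and unbound reaction-field potentials. The function names and all numerical inputs are hypothetical placeholders in arbitrary units; in practice the potentials would come from solving the boundary-value problem.

```python
def interaction_energy(receptor_charges, v_ligand_bound):
    """Screened interaction: sum over receptor charges q_j of q_j * V_L^bound(r_j)."""
    return sum(q * v for q, v in zip(receptor_charges, v_ligand_bound))


def ligand_desolvation(ligand_charges, v_hyd_bound, v_hyd_unbound):
    """Ligand desolvation: 1/2 * sum over ligand charges q_i of
    q_i * [V_hyd^bound(r_i) - V_hyd^unbound(r_i)]."""
    return 0.5 * sum(
        q * (vb - vu)
        for q, vb, vu in zip(ligand_charges, v_hyd_bound, v_hyd_unbound)
    )


def variational_binding_energy(receptor_charges, v_ligand_bound,
                               ligand_charges, v_hyd_bound, v_hyd_unbound):
    """Sum of the screened interaction and the ligand desolvation penalty."""
    return (interaction_energy(receptor_charges, v_ligand_bound)
            + ligand_desolvation(ligand_charges, v_hyd_bound, v_hyd_unbound))


# Placeholder inputs, not computed potentials.
receptor_q = [-1.0, 0.5]
v_lig_at_receptor = [2.0, 1.0]
ligand_q = [1.0, -0.5]
v_hyd_b = [-3.0, 1.0]
v_hyd_u = [-5.0, 2.0]

dg_int = interaction_energy(receptor_q, v_lig_at_receptor)
dg_hyd = ligand_desolvation(ligand_q, v_hyd_b, v_hyd_u)
dg_var = variational_binding_energy(receptor_q, v_lig_at_receptor,
                                    ligand_q, v_hyd_b, v_hyd_u)
print(dg_int, dg_hyd, dg_var)
```

With these placeholder inputs the interaction term is favorable and the desolvation term is unfavorable, which is the trade-off the optimization balances.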
The potential everywhere satisfies the Poisson equation.

Inside the sphere, it may be written as, « ' j 4 » < 14 > 

Zjei \r-n\ = 2j2j*U^"J 

Zoo / (7) 11 *=0 m— I 

,~' ^/ ZE^W^ 5 do I 



where the first term on the RHS is the coulombic and the 

second is the reaction-field (hydration) potential, and the jq 

summation over i corresponds to the ligand point charges. nmWpoto distribution is taken about the point 

Outside the sphere, the coulombic and reaction-field poten- _^ 

tial can be combined and written as, <* , but the potential is expressed as a summationof spheri- 
cal harmonics about the large-sphere center. The above 

t (8 ) 15 equation can also be written as, 

where A />m and B />m are to be determined by the proper 20 „ t t i i 

boundary conditions and Y /fOT (8, 4>) are the spherical har- V V f(^) a^i ^2^^^(2m) ,fi!^ ■ i, 
monics. The coulombic term in Eq. (7) is expanded in 
spherical harmonics and multipoles of the charge distribu- 
tion about the center of the sphere. Here the origin of the 

multipole expansion is shifted to d , whefe { ^ ^ ^ same y ^ (Q> ^ ^ g^ped together, 

as opposed to Eq. (14), where terms with the same Q'*,^ are 

Z_* = V £ = Y — *! — (9) grouped. 

M , 30 Upon substituting Eq. (15) into Eq. (7) and matching 

= Z E ^^^^Sr 2 < 10 > boundary conditions at r=R, 

where Q f /m is a spherical multipole expanded about the 35 dV > dVoja (17) 

primed origin, a , dr dr ^ R 

1 40 the hydration (reaction-field) potential inside the sphere is, 



The definition of the Y /m (0, used by Jackson 1 is adopted. ^VV, to ^ 

The expression in Eq. (10) is valid for r'>r\. (i.e., outside the v **w ~£j 2j -WW* « 
ligand or, more precisely, outside the sphere whose center is 

at d and whose radius is the longest distance between d -VV f 471 1Vr u (g 0)f —3-1 



(18) 



and a point charge). 

To substitute into Eq. (7) and combine terms involving 



/«0 



spherical harmonics, first Y /fm (6', <K)/r'' +1 of Eq. (10) is V ft d'-'f— IV/ 

expanded in terms of Y />m (0, ^/r 74 * 1 . This is done using the 50 fed " W + U - • 

results of Greengard, 2 which state that for r>d, 



(19) 



where 



'^W + lX^' + a + l)] 55 



<* Tne var i ous y» s can ne rewr itten, with their dependence on 

whcre the Q'*,^ made explicit. V^ w/ /< w is given by Eq. (10), 

60 V hydtL bound is given by Eq. (19) but rewritten so that the 

F(r+/+m'+m)!(r + J-m'-m)n* (13) terms with the same are collected, and V^ dJL unbound 

Kfsn'jjn = [ {V + m , W > - m 'y.{l+mW-m)\ \ ; is given by Eq. (19) with R=a and d=0. 

Since a geometry with 0^=0 (FIG. lb) has been used, only 65 K «*u. * r ; = Zj Zj S+T^ 

m=0 terms in Eq. (12) are non-vanishing, in which case Eq. 1=0 m= "' 
(10) becomes, 

(=0 m=-J if ml j 
1 

(^Tl) 5 ^^-w^^"'(^T/ ® where 

• i 4n c (23) 



(30) 



Substituting into Eq. (4), the dependence of AG^^^ on the and y /#m (^) is the operator obtained by replacing r with V . 

Q'*/,m ^ made explicit, p or positive m and when y (^) operates on a solution of 

„ r . 15 the Laplace equation (i.e., r% Je, <j>) or Y />m (9, <|>)/r /+1 ), it 

= YiVV^ffW + Vj^Cr/)] (24) has been shown that, 3 



i (31) 



/=0 W=-/ 

1 



for m & 0. 



The double-factorial is defined as 

25 

(2/ + 1)!! =(2i+l)-(2/-l).(2/-3)...3*l (32) 



~ 2'/! 

30 



(33) 



where in the last line the element a ljn is defined, which is . . 

independent of the to be the factor multiplying Q'% and the spherical partial derivatives are 

in Eq. (25). Each a 7y7J expresses the contribution of a 

multipole to AG*,^* and contains all information concern- 35 ^ = __L (Vjr + lV )f v _ t = _L (Vx - v 0 = v x . (34) 

ing the receptor charge distribution required to obtain AG^ V2 V2 

For AG^^ it is useful to re-express Eq. (5) in terms of 

the Q'* /|m , the multipoles describing the ligand charge 

distribution, rather than the individual charges, q £ . V(7) is Tq CQmpute for negative m> the fact uat Y/ _ m ( 9> 

expanded around the center of the multipole expansion, if, 40 c(>)=(-l) m Y* />m (e, <|>) is used and the definitions of spherical 

partial derivatives in Eq. (34) to obtain, 

5>Vtf) «^«Vp+J?) (27) t 

= + (28)45 

fcL form 2:0. 

It has been shown by Rose that in spherical coordinates the 

expansion becomes, 3 The hydration energy of the bound ligand is then 

<W«j£* , *B , P +; 0 =5Z S (2^^V^(V)V^(?) (36) 

= S Z oFTun^'^ZZZl^r)'!^) 5 x <37) 



^=0 rn'o-i* /=0 mo-/ r=i 

f=3 
To evaluate $\mathcal{Y}_{l,m}(\nabla)$ in Eq. (37), Eq. (31) and the gradient
formula are used,$^4$



30 

The hydration energy of the unbound ligand is obtained 
by setting d=0 and R=a in Eq. (46), 



(38) 



-•unbound 
'hydj. 



/=0 m=-/ 
00 / 



(49) 



/=0 m=— / 



where 



10 



Ti/JbQm Yi CaM./^i-m'.m')^^^^ (39) 

m'ct-l.O.ll 



the $C(l', 1, l; m-m', m')$ are the vector addition (or Clebsch-Gordan)
coefficients frequently encountered in the study of
angular momentum, shown in Table I,$^4$ and $\hat{\xi}_{m'}$ are spherical
unit vectors,


where $\gamma_{l,m}$ is defined by Eqs. (48) and (49). Here, $\gamma_{l,m}$ is
written as a function of both $l$ and $m$ for notational
convenience, although there is no formal dependence on $m$.
Thus $\Delta G_{var}$ has been expressed as a function of the
multipoles of the ligand charge distribution, $Q'_{l,m}$ (expanded
about the center of the ligand sphere), and the elements $\alpha_{l,m}$,
$\beta_{l,m,l',m'}$, and $\gamma_{l,m}$, which do not depend on $Q'_{l,m}$. Combining
Eqs. (26), (47), and (49) gives



Li = ^U*- tf). to = t 



(40) 20 



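$$\Delta G_{var} = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \alpha_{l,m} Q'^{*}_{l,m} + \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \sum_{l'=0}^{\infty} \sum_{m'=-l'}^{l'} \beta_{l,m,l',m'} Q'^{*}_{l,m} Q'_{l',m'} - \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \gamma_{l,m} Q'^{*}_{l,m} Q'_{l,m} \tag{50}$$

Accordingly,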



(41) 



From Eqs. (38) through (41), 

V H (r%^(e,4»-(-mK2^^ 

Using Table I, Eq. (31), and Eqs. (37) through (42), the 
following intermediate results are obtained, 



v{;^(/'^)= 



(43) 



Note that only the $\alpha_{l,m}$ depend on the receptor charges, while
the $\beta_{l,m,l',m'}$ and $\gamma_{l,m}$ depend solely on the geometry of the
bound and unbound states. While $\Delta G_{var}$ is a real quantity,
the $\alpha_{l,m}$ and $Q'_{l,m}$ are complex and the products $\alpha_{l,m} Q'^{*}_{l,m}$
and $Q'^{*}_{l,m} Q'_{l',m'}$ involve summations over terms of the form
^^6^0^(6, $)); note that the p, WVTJ . and y ItTn are real. 
Then AG var opr is rewritten in terms of the real and imaginary 
parts of a l m and Q',^, 



35 



(2T + !)(/* +m)!(r-;n) ! 



(2i» - 2/' +2m' + 1)(/" -m - /' +f»0! 
+/n-/'+m')! 



AC 



= V L O e!,o + *Z (Reor^ReGU + Imor^Ime^) 



(51) 



40 



(44) 



, r pr - 2£ + 2m; + !)(/»- m - Z' + m') ! II - t 
C " ir I 2*'(2i*-2i' + l)(*'-m-/' -/«')! J 

^r w wwj= (45) 

( " ir I 2^{2i»-2/' + l)(f + m-/'-/n')! J ^ 



/W.oCS,oGi'.o + 



45 



i=0 /'«o 1 



50 



and the final expression for the hydration energy of the (where the summations over m are excluded for 1=0) by 
ligand in the bound state, noting again that Y /f _ m (8, ♦M-l) m Y*4~( e i ♦) 



00 j ©a 



(=0 in—* f-O 



1 

{l*-D\{l»-V)\ m) !(/ - m) !(*' + m) !(/' - m) ! J 



s ZZZ Z ^vei;^ 



(47) 



where $\beta_{l,m,l',m'}$ is defined by the above two equations; note
that $\beta_{l,m,l',m'}$ is zero for $m' \neq m$.



65 



Y* 



-^R^^^^ (53! 
The new variables $\mathrm{Re}\,Q'_{l,m}$ and $\mathrm{Im}\,Q'_{l,m}$ are re-indexed and
renamed $Q_i$ as follows,

$$\{\mathrm{Re}\,Q'_{l,m},\ \mathrm{Im}\,Q'_{l,m},\ \ldots\} \rightarrow \{Q_1, Q_2, Q_3, Q_4, Q_5, Q_6, Q_7, Q_8, \ldots\}, \tag{54}$$

and similar transformations are used to create $\alpha_i$, $\beta_{ij}$, and
$\gamma_i$. Eq. (51) can then be written as

$$\Delta G_{var} = \sum_{i} \alpha_i Q_i + \sum_{i} \sum_{j} \beta_{ij} Q_i Q_j - \sum_{i} \gamma_i Q_i^2 \tag{55}$$

$$= \sum_{i} \alpha_i Q_i + \sum_{i} \sum_{j} (\beta_{ij} - \delta_{ij} \gamma_i) Q_i Q_j \tag{56}$$

or in matrix notation,

$$\Delta G_{var} = \vec{Q}^{T} B \vec{Q} + \vec{Q}^{T} \vec{A} \tag{57}$$

$$= \left(\vec{Q} + \tfrac{1}{2} B^{-1} \vec{A}\right)^{T} B \left(\vec{Q} + \tfrac{1}{2} B^{-1} \vec{A}\right) - \tfrac{1}{4} \vec{A}^{T} B^{-1} \vec{A} \tag{58}$$

where $\vec{Q}$ is the vector formed by the $Q_i$, $\vec{A}$ is the vector
formed by the $\alpha_i$, $B$ is the symmetric matrix formed by the
$(\beta_{ij} - \delta_{ij} \gamma_i)$, and completion of the square has been used to
arrive at Eq. (58). Since $\vec{Q}^{T} B \vec{Q}$ in Eq. (57) corresponds
to the ligand desolvation penalty, which must be greater than
zero for chemically reasonable geometries, the matrix $B$ is
positive definite and the extremum of $\Delta G_{var}$ is a minimum.$^5$

From Eq. (58) the optimum values of the multipoles, $\vec{Q}^{opt}$,
and the minimum variational binding energy, $\Delta G^{opt}_{var}$, are
obtained,

$$\vec{Q}^{opt} = -\tfrac{1}{2} B^{-1} \vec{A} \tag{59}$$

$$\Delta G^{opt}_{var} = -\tfrac{1}{4} \vec{A}^{T} B^{-1} \vec{A}. \tag{60}$$

$\Delta G^{opt}_{var}$ is always negative because $B^{-1}$ is also positive
definite.

To solve for the optimal multipole distribution with the
monopole (total charge) fixed ($Q_1 = k$), the equation for the
remaining optimal multipoles ($i \neq 1$) is,

$$2 \sum_{j \neq 1} (\beta_{ij} - \delta_{ij} \gamma_i) Q^{opt}_j + (2 \beta_{i1} k + \alpha_i) = 0 \tag{61}$$

which is analogous to Eq. (59).

The above matrix equations, with the dimension truncated
at $i_{max} = (l_{max} + 1)^2$, can be solved numerically with relatively
modest computational resources. In practice, since the $\alpha_i$
and $\beta_{ij}$ contain a summation over an infinite number of
terms, a second cutoff value of $l_{cut}$ must be used to truncate
the innermost sum in Eqs. (25) and (46). When $l_{max}$ and $l_{cut}$
are sufficiently large, $\Delta G^{opt}_{var}$ converges and the incremental
advantage of including more multipoles essentially vanishes.

For any given receptor and geometry, we have thus
described a method to determine the charge distribution of
the tightest binding ligand as a set of multipoles. The
deviation of the binding free energy from the optimum for
any test ligand can be calculated by subtracting Eq. (60)
from Eq. (58) and using Eq. (59) to eliminate $\vec{A}$,

$$\Delta G_{var} - \Delta G^{opt}_{var} = \left(\vec{Q} - \vec{Q}^{opt}\right)^{T} B \left(\vec{Q} - \vec{Q}^{opt}\right).$$

TABLE I

Vector addition coefficients $C(l', 1, l; m - m', m')$.$^a$

$$
\begin{array}{llll}
 & m' = 1 & m' = 0 & m' = -1 \\[4pt]
l = l' + 1 &
\left[\dfrac{(l'+m)(l'+m+1)}{(2l'+1)(2l'+2)}\right]^{1/2} &
\left[\dfrac{(l'-m+1)(l'+m+1)}{(2l'+1)(l'+1)}\right]^{1/2} &
\left[\dfrac{(l'-m)(l'-m+1)}{(2l'+1)(2l'+2)}\right]^{1/2} \\[10pt]
l = l' &
-\left[\dfrac{(l'+m)(l'-m+1)}{2l'(l'+1)}\right]^{1/2} &
\dfrac{m}{\left[l'(l'+1)\right]^{1/2}} &
\left[\dfrac{(l'-m)(l'+m+1)}{2l'(l'+1)}\right]^{1/2} \\[10pt]
l = l' - 1 &
\left[\dfrac{(l'-m)(l'-m+1)}{2l'(2l'+1)}\right]^{1/2} &
-\left[\dfrac{(l'-m)(l'+m)}{l'(2l'+1)}\right]^{1/2} &
\left[\dfrac{(l'+m+1)(l'+m)}{2l'(2l'+1)}\right]^{1/2}
\end{array}
$$

$^a$ From reference 4.
Having now described a few embodiments of the present
computer-implemented process, it should be apparent to 
those skilled in the art that the foregoing is merely illustra- 
tive and not limiting, having been presented by way of 
example only. Numerous modifications and other embodi- 
ments are within the scope of one of ordinary skill in the art 
and are contemplated as falling within the scope of the 
present process as defined by the appended claims and 
equivalent thereto. 
SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 1

<210> SEQ ID NO 1

<211> LENGTH: 13

<212> TYPE: PRT

<213> ORGANISM: Influenza A virus

<400> SEQUENCE: 1

Pro Lys Tyr Val Lys Gln Asn Thr Leu Lys Leu Ala Thr
1               5                   10



What is claimed is: 

1. A computer-implemented process for identifying prop- 
erties of a ligand for binding to a target molecule in a solvent 
comprising the steps of: 

receiving an indication of a selected shape of the ligand, 
defined in three dimensions, which complements a 
shape of a selected portion of the target molecule, 
defined in three dimensions; 

determining a representation of a charge distribution 
which minimizes electrostatic contribution to binding 
free energy between the ligand and the target molecule 
in the solvent. 

2. The computer-implemented process of claim 1,
wherein the representation of the charge distribution is a set 
of multipoles. 

3. The computer-implemented process of claim 1, further 
comprising the step of identifying a ligand having point 



charges that match the representation of the charge distri- 
bution. 

4. The computer-implemented process of claim 2, further 
comprising the step of identifying a ligand having point 
charges that match the representation of the charge distri- 
bution. 

5. The computer-implemented process of claim 1, further 
comprising the step of designing a combinatorial library 
containing ligands having point charges that match the 
representation of the charge distribution. 

6. The computer-implemented process of claim 2, further 
comprising the step of designing a combinatorial library 
containing ligands having point charges that match the 
representation of the charge distribution. 
Appendix B

Optimization of electrostatic binding free energy 

Lee-Peng Lee 

Departments of Chemistry and Physics, Massachusetts Institute of Technology, Cambridge, 
Massachusetts 02139-4307

Bruce Tidor a)

Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307
(Received 9 December 1996; accepted 24 February 1997)

An analytic result is derived that defines the charge distribution of the tightest-binding ligand given
a receptor charge distribution and spherical geometries. Using the framework of continuum
electrostatics, the optimal distribution is expressed as a set of multipoles determined by minimizing
the electrostatic free energy of binding. Results for two simple receptor systems are presented to
illustrate applications of the theory. © 1997 American Institute of Physics.
[S0021-9606(97)50221-2] 



I. INTRODUCTION 

One mechanism operating in many diseases is the unde- 
sirable action of a protein (here termed receptor) that can be 
arrested, at least in principle, through the tight binding of a 
molecular ligand (e.g., by sterically blocking the active site 
or by preventing a required conformational change). 1 To be 
effective as a drug, such a molecule must possess a number 
of important pharmacological activities, such as bioavailabil- 
ity and non-toxicity. One step in the discovery of drug mol- 
ecules is the identification or design of tight-binding ligands. 
Ligand design is particularly difficult because opposing con- 
tributions to the free energy of binding must be properly 
tuned. For instance, increasing the magnitude of a point 
charge in a ligand can enhance its interaction with receptor 
(tending to favor binding), but it will also enhance its inter- 
action with solvent in the unbound state (tending to disfavor 
binding). What magnitude charge should be chosen to bal- 
ance these effects and produce the most favorable free en- 
ergy of binding? The question can be generalized to all mul- 
tipole terms of the ligand charge distribution. The charge 
distribution that optimally balances these effects will bind 
tightest to the receptor. 
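
The balance described in this paragraph can be illustrated with a one-variable toy model. This is only a sketch, not the multipole treatment developed below: it assumes the interaction with the receptor is linear in a single ligand charge q (hypothetical coefficient `alpha`) and the desolvation penalty is quadratic in q (hypothetical coefficient `beta > 0`). The numerical values are made up for illustration.

```python
def binding_energy(q, alpha, beta):
    """Toy variational energy: linear interaction plus quadratic desolvation."""
    return alpha * q + beta * q ** 2


def optimal_charge(alpha, beta):
    """Return (q_opt, dG_opt) minimizing alpha*q + beta*q**2 for beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be positive for a minimum to exist")
    q_opt = -alpha / (2.0 * beta)          # zero of the derivative alpha + 2*beta*q
    dg_opt = -alpha ** 2 / (4.0 * beta)    # energy at the minimum
    return q_opt, dg_opt


# Hypothetical coefficients (arbitrary energy units per unit charge)
alpha, beta = -6.0, 5.0
q_opt, dg_opt = optimal_charge(alpha, beta)
print(q_opt, dg_opt)
```

Making the charge larger than `q_opt` strengthens the interaction term but costs more in desolvation than it gains, which is the trade-off the theory below generalizes to all multipoles.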

Here the problem of determining the ligand charge dis- 
tribution binding tightest to a given receptor is addressed 
using continuum electrostatic theory. In Section II a solution 
is presented for the case in which both the free ligand and the 
bound complex are spherical regions of low dielectric sur- 
rounded by aqueous medium of high dielectric and the be- 
havior of the system is governed by the Poisson equation. To 
facilitate an analytic solution the following assumptions are 
made: the ligand and receptor do not interact in the unbound 
state, the ligand charge distribution is the same in the bound 
and unbound state, and the ligand binds rigidly to the recep- 
tor with a unique orientation. The optimal charge distribution 
is obtained by expressing the ligand charge distribution as an 
arbitrary set of multipoles and minimizing the free energy of 
binding with respect to the multipoles. In Section III the
theory is applied to a highly symmetric charge distribution
devised for test purposes and to a second charge distribution,
the terminus of an alpha-helix, present in some protein binding
sites. Discussion and conclusions are presented in Section IV.

a) Author to whom correspondence should be addressed.



II. THEORY 

The electrostatic free energy of binding is the difference 
between the electrostatic free energy in the bound and the 
unbound state, $\Delta G_{binding} = G^{bound} - G^{unbound}$ (see Fig. 1a).
Because the dielectric model includes responses that affect the
entropy as well as the enthalpy, the electrostatic energy is
considered to be a free energy. Here we express the free
energy of each state as a sum of Coulombic and reaction-field
(hydration) terms involving the ligand (L), the receptor
(R), and their interaction (L-R)

$$G^{state} = G^{state}_{coul,L} + G^{state}_{coul,R} + G^{state}_{coul,L\text{-}R} + G^{state}_{hyd,L} + G^{state}_{hyd,R} + G^{state}_{hyd,L\text{-}R}. \qquad (1)$$

This results in the following expression for the binding free
energy:

$$\Delta G_{binding} = \Delta G_{coul,L\text{-}R} + \Delta G_{hyd,L\text{-}R} + \Delta G_{hyd,L} + \Delta G_{hyd,R}, \qquad (2)$$

where we have used the fact that the geometry of point 
charges in the receptor 2 and ligand remain fixed in the model 
to cancel the Coulombic self contribution of ligand and re- 
ceptor, and where the two L-R terms are due only to the 
bound state because the ligand and receptor are assumed not 
to interact in the unbound state. Thus, Eq. (2) describes the 
electrostatic binding free energy as a sum of desolvation con- 
tributions of the ligand and the receptor (which are unfavor- 
able) and solvent-screened electrostatic interaction in the 
bound state (which is usually favorable). Since our goal is to 
vary the ligand charge distribution to optimize the electro- 
static binding free energy and the last term simply adds a 
constant, we define a relevant variational binding energy, 

$$\Delta G_{var} = \Delta G_{int,L\text{-}R} + \Delta G_{hyd,L}, \qquad (3)$$



FIG. 1. Illustration of problem geometries. (a) The binding reaction is
shown between a receptor (R) and spherical ligand (L) that dock rigidly to
form a spherical bound-state complex. Receptor, ligand, and complex are all
low-dielectric media ($\epsilon_1$) that are surrounded by high-dielectric solvent
($\epsilon_2$). (b) The boundary-value problem solved here involves a charge
distribution in a spherical region of radius $R$ with dielectric constant $\epsilon_1$
surrounded by solvent with dielectric constant $\epsilon_2$. The origin of coordinates is
the center of the larger spherical region, but the charge distribution is
expanded in multipoles about a point a distance $d$ along the z-axis. The
geometric requirement is that the ligand sphere not extend beyond the receptor
sphere, $R \geq d + a$, although the case of equality is illustrated in the figure.



in which the first two terms on the RHS of Eq. (2) have been 
combined into a screened interaction term and the constant 
term has been dropped. Note that 



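$$\Delta G_{int,L\text{-}R} = \sum_{j \in R} q_j \left[ V^{bound}_{coul,L}(\vec{r}_j) + V^{bound}_{hyd,L}(\vec{r}_j) \right] \qquad (4)$$

and

$$\Delta G_{hyd,L} = \frac{1}{2} \sum_{i \in L} q_i \left[ V^{bound}_{hyd,L}(\vec{r}_i) - V^{unbound}_{hyd,L}(\vec{r}_i) \right], \qquad (5)$$

where $V^{state}_{type,L}$ is the total electrostatic potential in the indicated
state due to the ligand charge distribution only and the type is
the Coulombic or reaction-field (hydration) term, as indicated.
The summations are over atomic point charges in the
ligand ($i \in L$) or receptor ($j \in R$). The factor of 1/2 in Eq. (5) is
due to the fact that the ligand charge distribution interacts
with the self-induced reaction field.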
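
The factor of 1/2 can be seen in the simplest special case, a single charge at the center of a low-dielectric sphere (the Born model), which corresponds to the monopole term alone. The following is a minimal sketch under stated assumptions: energies are converted to kcal/mol with the commonly used factor 332.06 kcal Å/(mol e²), the function names are hypothetical, and the inputs (unit charge, radius 4 Å, dielectric constants 4 and 80) are chosen only for illustration.

```python
COULOMB_KCAL = 332.06  # assumed conversion factor, kcal*Angstrom/(mol*e^2)


def reaction_field_potential_at_center(q, a, eps_in, eps_out):
    """Reaction-field potential at the center of a sphere of radius a.

    The charge q sits at the center; eps_in is the dielectric constant
    inside the sphere and eps_out that of the surrounding solvent.
    """
    return COULOMB_KCAL * q / a * (1.0 / eps_out - 1.0 / eps_in)


def born_hydration_energy(q, a, eps_in, eps_out):
    """Hydration energy: one half of charge times its self-induced potential."""
    return 0.5 * q * reaction_field_potential_at_center(q, a, eps_in, eps_out)


# Illustrative inputs only
print(born_hydration_energy(1.0, 4.0, 4.0, 80.0))
```

Because the reaction field is itself proportional to q, the energy is quadratic in the charge, which is why the ligand hydration terms later appear as quadratic forms in the multipoles.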

We proceed by expressing $V^{bound}_{coul,L}$, $V^{bound}_{hyd,L}$, and $V^{unbound}_{hyd,L}$,
the three electrostatic potentials in Eqs. (4) and (5), in terms
of the given geometry and charge distribution by solving the 
boundary-value problem shown in Fig. lb. A charge distri- 
bution (corresponding to the ligand) is embedded in a sphere 
of radius R . We take the center of the sphere as the origin of 
coordinates (unprimed) but expand the charge distribution in 
multipoles about a second origin (primed) translated a dis- 
tance d along the z-axis, so that 

$$\vec{r}(r,\theta,\phi) = \vec{d}(d,\theta=0,\phi=0) + \vec{r}\,'(r',\theta',\phi'). \qquad (6)$$

The potential everywhere satisfies the Poisson equation. In- 
side the sphere, it may be written as 

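$$V_{in}(\vec{r}) = \sum_{i} \frac{q_i}{\epsilon_1 |\vec{r} - \vec{r}_i|} + \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{l,m}\, r^{l}\, Y_{l,m}(\theta,\phi), \qquad (7)$$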

where the first term on the RHS is the Coulombic and the 
second is the reaction-field (hydration) potential, and the 
summation over i corresponds to the ligand point charges. 
Outside the sphere, the Coulombic and reaction-field poten- 
tial can be combined and written as 



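$$V_{out}(\vec{r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \frac{B_{l,m}}{r^{l+1}}\, Y_{l,m}(\theta,\phi), \qquad (8)$$

where $A_{l,m}$ and $B_{l,m}$ are to be determined by the proper
boundary conditions and $Y_{l,m}(\theta,\phi)$ are the spherical harmonics.
The standard way to proceed is to expand the Coulombic
term in Eq. (7) in spherical harmonics and multipoles
of the charge distribution about the center of the sphere. Here
we shift the origin of the multipole expansion to $\vec{d}$,

$$\sum_{i} \frac{q_i}{\epsilon_1 |\vec{r} - \vec{r}_i|} = \sum_{i} \frac{q_i}{\epsilon_1 |\vec{r}\,' - \vec{r}\,'_i|} \qquad (9)$$

$$= \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \frac{4\pi}{2l+1}\, Q'^{*}_{l,m}\, \frac{Y_{l,m}(\theta',\phi')}{\epsilon_1\, r'^{\,l+1}}, \qquad (10)$$

where $Q'^{*}_{l,m}$ is a spherical multipole expanded about the
primed origin, $\vec{d}$,

$$Q'^{*}_{l,m} = \sum_{i} q_i\, r_i'^{\,l}\, Y^{*}_{l,m}(\theta'_i,\phi'_i). \qquad (11)$$

Note that throughout this work we adopt the definition of the
$Y_{l,m}(\theta,\phi)$ used by Jackson.³ The expression in Eq. (10) is
valid for $r' > r'_i$ (i.e., outside the ligand or, more precisely,
outside the sphere whose center is at $\vec{d}$ and whose radius is
the distance from $\vec{d}$ to the furthest point charge).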
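
The convergence region of the multipole expansion can be checked numerically. The sketch below is not the authors' program: it sets the dielectric constant to 1, places a few made-up point charges near the expansion center, and sums over m analytically with the spherical-harmonic addition theorem, $(4\pi/(2l+1))\sum_m Y^{*}_{l,m}(\theta_i,\phi_i)\,Y_{l,m}(\theta,\phi) = P_l(\cos\gamma_i)$, so that only Legendre polynomials are needed. All function names are hypothetical.

```python
import math


def legendre(l_max, x):
    """Return [P_0(x), ..., P_l_max(x)] using the Bonnet recurrence."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]


def direct_potential(charges, r):
    """Coulomb potential sum_i q_i / |r - r_i| (unit dielectric constant)."""
    return sum(q / math.dist(r, ri) for q, ri in charges)


def multipole_potential(charges, r, l_max):
    """Multipole series about the origin, truncated at l_max.

    Valid only when |r| exceeds the distance of every charge from the origin.
    """
    r_len = math.hypot(*r)
    total = 0.0
    for q, ri in charges:
        ri_len = math.hypot(*ri)
        if ri_len == 0.0:
            total += q / r_len  # a charge at the origin is a pure monopole
            continue
        cos_gamma = sum(a * b for a, b in zip(r, ri)) / (r_len * ri_len)
        p = legendre(l_max, cos_gamma)
        total += q * sum(ri_len ** l / r_len ** (l + 1) * p[l]
                         for l in range(l_max + 1))
    return total


# Made-up charges (charge, position relative to the expansion center)
charges = [(0.5, (1.0, 0.0, 0.5)), (-0.5, (-1.0, 0.5, 0.0)), (1.0, (0.0, 0.0, -1.0))]
point = (3.0, 2.0, 4.0)  # lies outside the sphere enclosing all charges
print(direct_potential(charges, point), multipole_potential(charges, point, 20))
```

With the field point well outside the charges, the truncated series agrees with the direct sum to near machine precision; moving the point inside the enclosing sphere breaks the expansion, which is the validity condition stated above.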

To substitute into Eq. (7) and combine terms involving
spherical harmonics, we first expand $Y_{l,m}(\theta',\phi')/r'^{\,l+1}$ of
Eq. (10) in terms of $Y_{l,m}(\theta,\phi)/r^{l+1}$. This is readily done
using the results of Greengard,⁴ which state that for $r > d$,

177+1 



Z' 



2 2 Kl',m',l f 
/'-0 m' = -l f 



4^(2/+!) 



(2r + l)(2/'+2/+l) 



1/2 



./'+/+1 



(12) 



where 



(l' +m')\(l' -m')\(l+m)\(l-m)l m 



1/2 



/ = 0 m = -/ 

' / 47r \l/2 

X ( p2/+l ) 2 Ki-vwj4 l ~ l 
\ K //'-Iml 



4tt 



1/2 



x Ur + iJ fi/,/ ^' 



where 



(^1-^2) 



' ^[Cj + ^/Ci+l)]" 



(13) 



(18) 



(19) 



(20) 



Since we have chosen a geometry with $\theta_d = 0$ (Fig. 1b), only
$m' = 0$ terms in Eq. (12) are non-vanishing, in which case
Eq. (10) becomes



y gj y y IMtt ^ 1/2 



We can now write the various $V$'s, with their dependence on
the $Q'^{*}_{l,m}$ made explicit. $V^{bound}_{coul,L}$ is given by Eq. (10), $V^{bound}_{hyd,L}$ is
given by Eq. (19) but rewritten so that the terms with the
same $Q'^{*}_{l,m}$ are collected, and $V^{unbound}_{hyd,L}$ is given by Eq. (19)
with $R = a$ and $d = 0$,



^ coul,tA r >~ A ^ , 9/4.1 c r 'l + l » 

/=0 m— — / £L-r L 6\r 



(21) 



00 



/'=o 



4tt 



1/2 



2/'+2/+l 



P/'-H,m(ft <ft) 
^' + 7+1 



(14) 



in which the multipole distribution is taken about the point 

d, but the potential is expressed as a summation of spherical 
harmonics about the large-sphere center. The above equation 
can also be written as, 



2 2 M 477 ^ i2Yi ' m( ° 9 * ) 



i ei\r—r t \ i«o m«-/ €i\2/+l 



(15) 

where terms with the same $Y_{l,m}(\theta,\phi)$ are grouped together,
as opposed to Eq. (14), where terms with the same $Q'^{*}_{l,m}$ are
grouped.

Upon substituting Eq. (15) into Eq. (7) and matching
boundary conditions at $r = R$,

$$V_{in}\big|_{r=R} = V_{out}\big|_{r=R}, \qquad (16)$$

$$\epsilon_1 \frac{\partial V_{in}}{\partial r}\bigg|_{r=R} = \epsilon_2 \frac{\partial V_{out}}{\partial r}\bigg|_{r=R}, \qquad (17)$$

we obtain the hydration (reaction-field) potential inside the
sphere,


1/2 



c„ 



'—Ar l 'Y v , m {e 9 <i>), 



R 



21 



(22) 



CrVVS 2_ f ^^Jfi^iW W (23) 

Substituting into Eq. (4), we make explicit the dependence of
$\Delta G_{int,L\text{-}R}$ on the $Q'^{*}_{l,m}$,



(24) 



-2 SfliSSft 

/=0 /n = -/ ;gR 



4tt \ y,, m (g;,^') 

2/+1/ C!^"*" 1 



00 



+ 2 



477 



1/2 



4tt 



1/2 



(25) 



$$= \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \alpha_{l,m}\, Q'^{*}_{l,m}, \qquad (26)$$

where in the last line we have defined the element $\alpha_{l,m}$,
which is independent of the $Q'^{*}_{l,m}$, to be the factor multiplying
$Q'^{*}_{l,m}$ in Eq. (25). Each $\alpha_{l,m}$ expresses the contribution of
a multipole to $\Delta G_{int,L\text{-}R}$ and contains all information concerning
the receptor charge distribution required to obtain

For $\Delta G_{hyd,L}$ it is useful to re-express Eq. (5) in terms of
the $Q'^{*}_{l,m}$, the multipoles describing the ligand charge distribution,
rather than the individual charges, $q_i$. We expand
$V(\vec{r})$ around the center of the multipole expansion, $\vec{d}$,

$$\sum_{i \in L} q_i V(\vec{r}_i) = \sum_{i \in L} q_i V(\vec{d} + \vec{r}\,'_i) \qquad (27)$$

$$= \sum_{i \in L} q_i \left[ V(\vec{d}) + \vec{r}\,'_i \cdot \nabla V(\vec{d}) + \cdots \right]. \qquad (28)$$

It has been shown by Rose that in spherical coordinates the 
expansion becomes, 5 



2 qiV(d+fl)=^ 2 75j^e'U^ m (vw5) 

ieL /=0 m=-l 



(29) 



where 



$$\mathcal{Y}_{l,m}(\vec{r}) = r^{l}\, Y_{l,m}(\theta,\phi), \qquad (30)$$

and $\mathcal{Y}_{l,m}(\nabla)$ is the operator obtained by replacing $\vec{r}$ with
$\nabla$. For positive $m$ and when $\mathcal{Y}_{l,m}(\nabla)$ operates on a solution
of the Laplace equation (i.e., $r^{l}\, Y_{l,m}(\theta,\phi)$ or
$Y_{l,m}(\theta,\phi)/r^{l+1}$), it has been shown that⁵



^ v (20! 

j^*(V)= w 



21+1 



im 



4tt j(l + m)\(l-m)l 



1/2 



The double-factorial is defined as 

$$(2l+1)!! = (2l+1)\cdot(2l-1)\cdot(2l-3)\cdots 3\cdot 1 = \frac{(2l+1)!}{2^{l}\, l!}$$

and the spherical partial derivatives are 



(31) 



(32) 



(33) 



(34) 
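
The double-factorial identity quoted above is easy to confirm numerically. This short check is only an illustration; the function names are hypothetical.

```python
import math


def double_factorial_odd(l):
    """(2l+1)!! as the explicit product (2l+1)(2l-1)...3*1."""
    result = 1
    for k in range(1, 2 * l + 2, 2):  # odd integers 1, 3, ..., 2l+1
        result *= k
    return result


def double_factorial_closed(l):
    """(2l+1)!! from the closed form (2l+1)! / (2**l * l!)."""
    return math.factorial(2 * l + 1) // (2 ** l * math.factorial(l))


for l in range(6):
    print(l, double_factorial_odd(l), double_factorial_closed(l))
```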



To compute $\mathcal{Y}_{l,-m}(\nabla)$ for negative $m$, we use the fact that
$Y_{l,-m}(\theta,\phi) = (-1)^{m}\, Y^{*}_{l,m}(\theta,\phi)$ and the definitions of
spherical partial derivatives in Eq. (34) to obtain



- (20! 
^- w (V)=^r i 7 



2Z+1 



m 



1/2 



4tt l(l + m)\(l-m)l 
xV!,V[f ffl for rn^O. 

The hydration energy of the bound ligand is then 



(35) 



1 



gL 



) 



4-7T 



"2,?o.w(«' + 1 )" 



(36) 



_Iy y 4<7r o l * 



CO / CO 

x2 2 2 

X Q'*m K l"-l,0.l.m d 



I 4 77 



1/2 



47T 



1/2 



2Z+1/ \2/"+l 



2/"+l 



(37) 



r=d 



To evaluate ^/, m (V) in Eq. (37), we use Eq. (31) and the 
gradient formula 6 



$(®(r)Y ltm (e,<f>)) 

-mi 



l+l \ V2 ld<&(r) I 

jr-- r ^))T U+ u&.4>) 



+1 ^)"(™ + ^. (r)|ww >. 



dr 



(38) 



where 



m'e{- 1,0,1} 

X^, m - m ,(M)f m ', (39) 

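the $C(l',1,l;\, m-m',\, m')$ are the vector addition (or
Clebsch-Gordan) coefficients frequently encountered in the
study of angular momentum shown in Table I,⁶ and $\hat{\xi}_{m'}$ are
spherical unit vectors,

$$\hat{\xi}_{1} = -\frac{1}{\sqrt{2}}(\hat{x} + i\hat{y}), \qquad \hat{\xi}_{-1} = \frac{1}{\sqrt{2}}(\hat{x} - i\hat{y}), \qquad \hat{\xi}_{0} = \hat{z}. \qquad (40)$$

It is straightforward to show that

$$\nabla = \hat{x}\nabla_x + \hat{y}\nabla_y + \hat{z}\nabla_z = -\hat{\xi}_{1}\nabla_{-1} - \hat{\xi}_{-1}\nabla_{1} + \hat{\xi}_{0}\nabla_{0}. \qquad (41)$$

From Eqs. (38) through (41), we have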

V M (r'r,, m (M)) 

= (-l) M [/(2Z+l)] 1/2 
x^l-lXlw + fr-fL^Yt-^+^e.t). (42) 

Using Table I, Eq. (31), and Eqs. (37) through (42), we ob- 
tain the following intermediate results: 
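TABLE I. Vector addition coefficients $C(l',1,l;\, m-m',\, m')$.ᵃ

$$\begin{array}{l|ccc}
 & m' = 1 & m' = 0 & m' = -1 \\ \hline
l = l'+1 & \left[\dfrac{(l'+m)(l'+m+1)}{(2l'+1)(2l'+2)}\right]^{1/2} & \left[\dfrac{(l'-m+1)(l'+m+1)}{(2l'+1)(l'+1)}\right]^{1/2} & \left[\dfrac{(l'-m)(l'-m+1)}{(2l'+1)(2l'+2)}\right]^{1/2} \\
l = l' & -\left[\dfrac{(l'+m)(l'-m+1)}{2l'(l'+1)}\right]^{1/2} & \dfrac{m}{[l'(l'+1)]^{1/2}} & \left[\dfrac{(l'-m)(l'+m+1)}{2l'(l'+1)}\right]^{1/2} \\
l = l'-1 & \left[\dfrac{(l'-m)(l'-m+1)}{2l'(2l'+1)}\right]^{1/2} & -\left[\dfrac{(l'-m)(l'+m)}{l'(2l'+1)}\right]^{1/2} & \left[\dfrac{(l'+m+1)(l'+m)}{2l'(2l'+1)}\right]^{1/2}
\end{array}$$

ᵃFrom Reference 6.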
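
The entries of Table I can be sanity-checked numerically. The sketch below encodes the standard vector addition coefficients for coupling with angular momentum 1 (the function name and argument order are hypothetical) and verifies the normalization condition that, for fixed $l'$, $l$, and $m$, the squares of the three coefficients sum to one.

```python
import math


def vector_addition_coefficient(l_prime, l, m, m_prime):
    """C(l', 1, l; m - m', m'), where m is the total projection."""
    lp = l_prime
    if m_prime not in (-1, 0, 1) or abs(l - lp) > 1 or l < 0 or abs(m) > l:
        raise ValueError("arguments outside the range covered by the table")
    if lp == 0 and l != 1:
        raise ValueError("l' = 0 couples only to l = 1")
    if l == lp + 1:
        den = (2 * lp + 1) * (2 * lp + 2)
        if m_prime == 1:
            return math.sqrt((lp + m) * (lp + m + 1) / den)
        if m_prime == 0:
            return math.sqrt((lp - m + 1) * (lp + m + 1) / ((2 * lp + 1) * (lp + 1)))
        return math.sqrt((lp - m) * (lp - m + 1) / den)
    if l == lp:
        den = 2 * lp * (lp + 1)
        if m_prime == 1:
            return -math.sqrt((lp + m) * (lp - m + 1) / den)
        if m_prime == 0:
            return m / math.sqrt(lp * (lp + 1))
        return math.sqrt((lp - m) * (lp + m + 1) / den)
    # remaining case: l == lp - 1
    den = 2 * lp * (2 * lp + 1)
    if m_prime == 1:
        return math.sqrt((lp - m) * (lp - m + 1) / den)
    if m_prime == 0:
        return -math.sqrt((lp - m) * (lp + m) / (lp * (2 * lp + 1)))
    return math.sqrt((lp + m + 1) * (lp + m) / den)


def normalization(l_prime, l, m):
    """Sum over m' of the squared coefficients; should equal 1."""
    return sum(vector_addition_coefficient(l_prime, l, m, mp) ** 2 for mp in (-1, 0, 1))


print(normalization(2, 3, 1), normalization(2, 2, -2), normalization(2, 1, 0))
```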



(2/"+l)(Z" + m)!(r-j»i)! 



11/2 



(2r-2r+2m / + l)(r-m-/ , + m')!U"+w-r+m')J. 



(43) 



^/"-Z' + m',m)~(~l) 



/n 



(2r-2/ / + 2m' + l)(Z ff -m-r+m')! 



1 1/2 



2 m (2r-2i' + l)(r-m-/ / -m')! 



(44) 



V-(r^-'' + «'r l „_ J , +m ,, m )=(-l)' 
and the final expression for the hydration energy of the ligand in the bound state, 

bound 1 V ound - *V V V V / ^ V^f ^ 



(2r-2/ f + 2ifi' + l)(r + m-r + m')! 



2 m '(2/"-2/' + l)(r +m-r -m')! 



1/2 



1/2 



X 



(r+m)!(/"-i«)! 


1 


(/"-/)!(/"-/')! 


(/+jn)!(f-m)!(/'+in)!(/'-m)! ) 



1/2 



,2/"-/-/' 



(45)

(46)

$$= \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} \beta_{l,m,l',m'}\, Q'^{*}_{l,m}\, Q'_{l',m'}, \qquad (47)$$

where $\beta_{l,m,l',m'}$ is defined by the above two equations; note
that $\beta_{l,m,l',m'}$ is zero for $m' \neq m$. We obtain the hydration
energy of the unbound ligand by setting $d = 0$ and $R = a$ in
Eq. (46),



1 _ 

G unbound "x 1 „ y /unbound/ ~*\ 
hyd.L ^i^hyd.L \ r 0 



= t2 2 



4tt I C\ i , 

27+T 2/, m 2/, 



2/=o m=-/ 2/+1 \a 



■2 Srufl/X- 

i=0 m=— / 



(48) 



(49) 



where $\gamma_{l,m}$ is defined by Eqs. (48) and (49). We write $\gamma_{l,m}$ as
a function of both $l$ and $m$ for notational convenience, although
there is no formal dependence on $m$.

Thus $\Delta G_{var}$ has been expressed as a function of the multipoles
of the ligand charge distribution, $Q'_{l,m}$ (expanded
about the center of the ligand sphere) and the elements
$\alpha_{l,m}$, $\beta_{l,m,l',m'}$, and $\gamma_{l,m}$, which do not depend on $Q'_{l,m}$.
Combining Eqs. (26), (47), and (49) gives

$$\Delta G_{var} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \alpha_{l,m}\, Q'^{*}_{l,m} + \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} \beta_{l,m,l',m'}\, Q'^{*}_{l,m}\, Q'_{l',m'} - \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \gamma_{l,m}\, Q'^{*}_{l,m}\, Q'_{l,m}. \qquad (50)$$



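Note that only the $\alpha_{l,m}$ depend on the receptor charges, while
the $\beta_{l,m,l',m'}$ and $\gamma_{l,m}$ depend solely on the geometry of the
bound and unbound states. While $\Delta G_{var}$ is a real quantity,
the $\alpha_{l,m}$ and $Q'_{l,m}$ are complex and the products $\alpha_{l,m} Q'^{*}_{l,m}$ and
$Q'^{*}_{l,m} Q'_{l',m'}$ involve summations over terms of the form
$Y^{*}_{l,m}(\theta',\phi')\, Y_{l,m}(\theta,\phi)$; note that the $\beta_{l,m,l',m'}$ and $\gamma_{l,m}$ are
real. We rewrite $\Delta G_{var}$ in terms of the real and imaginary
parts of $\alpha_{l,m}$ and $Q'_{l,m}$,

$$\Delta G_{var} = \sum_{l=0}^{\infty} \left[ \alpha_{l,0}\, Q'_{l,0} + 2\sum_{m=1}^{l} \left( \mathrm{Re}\,\alpha_{l,m}\, \mathrm{Re}\,Q'_{l,m} + \mathrm{Im}\,\alpha_{l,m}\, \mathrm{Im}\,Q'_{l,m} \right) \right]$$
$$+ \sum_{l=0}^{\infty}\sum_{l'=0}^{\infty} \left[ \beta_{l,0,l',0}\, Q'_{l,0}\, Q'_{l',0} + 2\sum_{m=1}^{\min(l,l')} \beta_{l,m,l',m} \left( \mathrm{Re}\,Q'_{l,m}\, \mathrm{Re}\,Q'_{l',m} + \mathrm{Im}\,Q'_{l,m}\, \mathrm{Im}\,Q'_{l',m} \right) \right]$$
$$- \sum_{l=0}^{\infty} \left[ \gamma_{l,0}\, Q'^{\,2}_{l,0} + 2\sum_{m=1}^{l} \gamma_{l,m} \left( \mathrm{Re}\,Q'^{\,2}_{l,m} + \mathrm{Im}\,Q'^{\,2}_{l,m} \right) \right] \qquad (51)$$

(where the summations over $m$ are excluded for $l = 0$) by
noting again that $Y_{l,-m}(\theta,\phi) = (-1)^{m}\, Y^{*}_{l,m}(\theta,\phi)$ and

$$Y^{*}_{l,m}(\theta',\phi')\, Y_{l,m}(\theta,\phi) + Y^{*}_{l,-m}(\theta',\phi')\, Y_{l,-m}(\theta,\phi)$$
$$= Y^{*}_{l,m}(\theta',\phi')\, Y_{l,m}(\theta,\phi) + Y_{l,m}(\theta',\phi')\, Y^{*}_{l,m}(\theta,\phi) \qquad (52)$$
$$= 2\left[ \mathrm{Re}\,Y_{l,m}(\theta',\phi')\, \mathrm{Re}\,Y_{l,m}(\theta,\phi) + \mathrm{Im}\,Y_{l,m}(\theta',\phi')\, \mathrm{Im}\,Y_{l,m}(\theta,\phi) \right]. \qquad (53)$$

The new variables $\mathrm{Re}\,Q'_{l,m}$ and $\mathrm{Im}\,Q'_{l,m}$ are re-indexed and
renamed $Q_i$ as follows:

$$\{Q'_{0,0},\, Q'_{1,0},\, \mathrm{Re}\,Q'_{1,1},\, \mathrm{Im}\,Q'_{1,1},\, Q'_{2,0},\, \mathrm{Re}\,Q'_{2,1},\, \mathrm{Im}\,Q'_{2,1},\, \mathrm{Re}\,Q'_{2,2},\, \ldots\} \rightarrow \{Q_1, Q_2, Q_3, Q_4, Q_5, Q_6, Q_7, Q_8, \ldots\}, \qquad (54)$$

and similar transformations are used to create $\alpha_i$, $\beta_{ij}$, and
$\gamma_i$. Equation (51) can then be written as

$$\Delta G_{var} = \sum_{i=1}^{\infty} \alpha_i Q_i + \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} \beta_{ij} Q_i Q_j - \sum_{i=1}^{\infty} \gamma_i Q_i^{2} \qquad (55)$$

$$= \sum_{i=1}^{\infty} \alpha_i Q_i + \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} (\beta_{ij} - \delta_{ij}\gamma_i)\, Q_i Q_j, \qquad (56)$$

or in matrix notation,

$$\Delta G_{var} = \vec{Q}^{T} B \vec{Q} + \vec{Q}^{T} \vec{A} \qquad (57)$$

$$= \left(\vec{Q} + \tfrac{1}{2} B^{-1}\vec{A}\right)^{T} B \left(\vec{Q} + \tfrac{1}{2} B^{-1}\vec{A}\right) - \tfrac{1}{4}\, \vec{A}^{T} B^{-1} \vec{A}, \qquad (58)$$

where $\vec{Q}$ is the vector formed by the $Q_i$, $\vec{A}$ is the vector
formed by the $\alpha_i$, $B$ is the symmetric matrix formed by the
$(\beta_{ij} - \delta_{ij}\gamma_i)$, and completion of the square has been used to
arrive at Eq. (58). Since $\vec{Q}^{T} B \vec{Q}$ in Eq. (57) corresponds to
the ligand desolvation penalty, which must be greater than
zero for chemically reasonable geometries, the matrix $B$ is
positive definite and the extremum of $\Delta G_{var}$ is a minimum.⁷
From Eq. (58) the optimum values of the multipoles, $\vec{Q}^{opt}$,
and the minimum variational binding energy, $\Delta G^{opt}_{var}$, are
obtained,

$$\vec{Q}^{opt} = -\tfrac{1}{2} B^{-1} \vec{A}, \qquad (59)$$

$$\Delta G^{opt}_{var} = -\tfrac{1}{4}\, \vec{A}^{T} B^{-1} \vec{A}. \qquad (60)$$

$\Delta G^{opt}_{var}$ is always negative because $B^{-1}$ is also positive
definite.

To solve for the optimal multipole distribution with the
monopole (total charge) fixed ($Q_1 = k$), the equation for the
remaining optimal multipoles ($i \neq 1$) is

$$\sum_{j \neq 1} 2(\beta_{ij} - \delta_{ij}\gamma_i)\, Q^{opt}_j + (2\beta_{i1} k + \alpha_i) = 0, \qquad (61)$$

which is analogous to Eq. (59).

The above matrix equations, with the dimension truncated
at $i_{max} = (l_{max}+1)^{2}$, can be solved numerically by relatively
modest computational resources. In practice, since the
$\alpha_i$ and $\beta_{ij}$ contain a summation over an infinite number of
terms, a second cutoff value of $l_{cut}$ must be used to truncate
the innermost sum in Eqs. (25) and (46). When $l_{max}$ and
$l_{cut}$ are sufficiently large, $\Delta G^{opt}_{var}$ converges and the incremental
advantage of including more multipoles essentially vanishes.

For any given receptor and geometry, we have thus
described a method to determine the charge distribution of the
tightest binding ligand as a set of multipoles. The deviation
of the binding free energy from the optimum for any test
ligand can be calculated by subtracting Eq. (60) from Eq.
(58) and using Eq. (59) to eliminate $\vec{A}$,

$$\Delta G_{var} - \Delta G^{opt}_{var} = (\vec{Q} - \vec{Q}^{opt})^{T} B\, (\vec{Q} - \vec{Q}^{opt}). \qquad (62)$$

III. RESULTS

A. Implementation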


The algorithm described was implemented in a computer
program whose input was $l_{max}$ [which determined the size of
the matrix in Eqs. (59) and (61)], $l_{cut}$ [which was used to
truncate the innermost summation in Eqs. (25) and (46)], the
geometry of the problem, and whether the monopole of the
optimum was to be free or fixed at some value. The geometry
of the problem included the radius and coordinates of
the center of both the bound-state and ligand spheres (on the
z-axis) and the coordinates and magnitude of each partial
atomic charge in the system. The dielectric constants $\epsilon_1$ and
$\epsilon_2$ were chosen to be 4 and 80, respectively. Evaluation of
the $\alpha_i$, $\beta_{ij}$, and $\gamma_i$ was carried out, followed by solution of
the matrix equation [Eq. (59) or (61)] using LU decomposition.
The eigenvalues of the $B$ matrix were obtained to verify
that the stationary point was a minimum. All real floating-point
values were represented using 64 bits. The matrix algebra
was accomplished using increased-precision versions
of the appropriate subroutines given by Press et al. The output
of the program included the multipoles for the optimal
charge distribution, $\Delta G^{opt}_{var}$, the nature of the stationary point,
and a file recording the $\alpha_i$, $\beta_{ij}$, and $\gamma_i$. Typical CPU usage
for a receptor with $l_{max} = l_{cut} = 40$ was 20 minutes on a
Hewlett-Packard 9000/735 with the PA-7200 (99 MHz)
chip, and the maximum memory used was roughly 22 MB.
Because we have used a direct method (i.e., LU decomposition)
to solve the matrix equation, where the matrix is of size
$(l_{max}+1)^{2} \times (l_{max}+1)^{2}$, the time scales as $(l_{max})^{6}$ and the
memory scales as $(l_{max})^{4}$. At this point no attempt has been
made to optimize the code. For example, the matrix equation
contains a particularly sparse matrix (due to the azimuthal
geometry chosen for the problem) that may be used to reduce
the necessary computational effort. The optimization problem
may also be solved with iterative methods, such as the
conjugate gradient method or various relaxation methods.
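
The matrix size and memory figures quoted above follow from simple bookkeeping. The sketch below is an illustration, not the authors' code: it assumes the re-indexing of Eq. (54), in which each $Q'_{l,0}$ contributes one real variable and each $Q'_{l,m}$ with $m \geq 1$ contributes a real and an imaginary part, and it assumes a dense matrix of 64-bit values. The function names are hypothetical.

```python
def flat_index(l, m, part="re"):
    """1-based index i of Q_i for Re/Im Q'_{l,m} in the ordering of Eq. (54)."""
    if l < 0 or m < 0 or m > l:
        raise ValueError("require 0 <= m <= l")
    if m == 0:
        if part != "re":
            raise ValueError("Q'_{l,0} is real, so it has no imaginary part")
        return l * l + 1
    if part not in ("re", "im"):
        raise ValueError("part must be 're' or 'im'")
    return l * l + 2 * m + (0 if part == "re" else 1)


def matrix_dimension(l_max):
    """Number of real variables when the expansion is truncated at l_max."""
    return (l_max + 1) ** 2


def dense_matrix_bytes(l_max, bytes_per_value=8):
    """Storage for the dense B matrix with 64-bit floating-point entries."""
    n = matrix_dimension(l_max)
    return n * n * bytes_per_value


print(matrix_dimension(40), dense_matrix_bytes(40) / 1e6)  # dimension, megabytes
```

For $l_{max} = 40$ this gives a 1681 × 1681 matrix occupying about 22.6 MB, consistent with the memory usage reported above; the $n^3$ cost of a direct solve with $n = (l_{max}+1)^2$ gives the stated $(l_{max})^6$ time scaling.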

B. Test problems 

The first test problem consisted of a receptor with four
parallel dipolar groups, each containing a negative charge of
-0.55e in the z = 15.50 plane and a positive charge of
+0.55e in the z = 14.25 plane. All lengths and distances are
given in units of angstroms (1 Å = 0.1 nm). The (x,y) coordinates
of the charges were (+1.5,+1.5), (-1.5,+1.5),
(-1.5,-1.5), and (+1.5,-1.5). The bound-state low-dielectric
region was bounded by a sphere of radius 24.0
centered at the origin, the ligand sphere was of radius 4.0 and
was centered at (x,y,z) = (0.0,0.0,20.0) in the bound state.
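
The geometry of this first test problem can be written down directly. The sketch below only builds the receptor charges and checks the geometric requirement $R \geq d + a$ from Fig. 1; it does not perform the optimization, and the function names are hypothetical.

```python
import math


def four_dipole_receptor(q=0.55, z_neg=15.50, z_pos=14.25, xy=1.5):
    """Return [(charge, (x, y, z)), ...] for the four parallel dipolar groups."""
    corners = [(+xy, +xy), (-xy, +xy), (-xy, -xy), (+xy, -xy)]
    charges = []
    for x, y in corners:
        charges.append((-q, (x, y, z_neg)))  # negative end, z = 15.50 plane
        charges.append((+q, (x, y, z_pos)))  # positive end, z = 14.25 plane
    return charges


def ligand_fits(big_radius, offset, ligand_radius):
    """Geometric requirement that the ligand sphere stays inside the complex."""
    return offset + ligand_radius <= big_radius


receptor = four_dipole_receptor()
net_charge = sum(q for q, _ in receptor)
dipole_z = sum(q * pos[2] for q, pos in receptor)  # z-component, in e*Angstrom
ligand_center = (0.0, 0.0, 20.0)
closest = min(math.dist(pos, ligand_center) for _, pos in receptor)
print(net_charge, dipole_z, closest, ligand_fits(24.0, 20.0, 4.0))
```

The receptor is neutral, its charges lie outside the ligand sphere of radius 4.0, and the ligand sphere touches the boundary of the radius-24.0 complex, which is the case of equality illustrated in Fig. 1.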

The second test problem consisted of an idealized alpha-helix
as the receptor. The helix was constructed from 18
alanine residues with acetyl and N-methylamide blocking
groups at the N- and C-terminus, respectively. Coordinates
were generated in the polar-hydrogen representation with the
CHARMM PARAM19 (Ref. 9) bond lengths and angles and with
φ = -57° and ψ = -47°. The partial atomic charges were
adapted from the PARSE parameter set.¹⁰ The axis of the helix
coincided with the z-axis of the coordinate system and the
nitrogen atom of Ala 10 was closest to the origin. The
bound-state low-dielectric region was bounded by a sphere
of radius 24.0 centered at the origin, the ligand sphere was of
radius 4.0, and the ligand multipole distribution was centered
at (x,y,z) = (0.0,0.0,20.0) in the bound state (near the
C-terminus of the poly-alanine alpha-helix).

C. Analysis of results 

Each test problem was solved multiple times using different
values of $l_{max}$ (with $l_{cut}$ fixed at 40 for the results
shown here, though essentially indistinguishable results were
obtained when the value was increased to 80) and with the
monopole of the variational distribution either free or fixed at
0 or +1e. Figure 2 shows the convergence of the calculated
$\Delta G^{opt}_{var}$ as a function of the value of $l_{max}$ used (part a is for the
four-dipolar-groups problem and part b is for the alpha-helix).
In all cases the calculated $\Delta G^{opt}_{var}$ was monotonically
decreasing for increasing $l_{max}$, and for any value of $l_{max}$,
$\Delta G^{opt}_{var}$ was lower (more favorable) with free than with fixed
monopole value, as expected for a variational optimization.
For both test problems the value of $\Delta G^{opt}_{var}$ appeared to
change very little beyond an $l_{max}$ of 20 for floating or fixed
monopole. Figure 3 shows the magnitude of the low multipole
moments of the optimized distribution as a function of
$l_{max}$ (with free monopole value). The magnitude of the
$2^{l}$-pole is defined as $|Q_l| \equiv \sqrt{(4\pi/(2l+1)) \sum_m (Q_{l,m}/a^{l})^{2}}$,
where $a$ is the ligand radius. The magnitudes of the first six
multipoles converged by an $l_{max}$ of 10. Figure 4 shows the
Coulombic potential due to the calculated optimal ligand,
again as a function of $l_{max}$; the potential appeared nearly
converged at an $l_{max}$ of 20. The converged Coulombic potential
of the optimal ligand, plotted in the xy-plane just outside
the ligand (at z = 16.0) and computed with an $l_{max}$ of 40, is
shown in Fig. 5a for the four-dipolar-groups problem and
Fig. 5c for the alpha-helix. The optimal ligand's potential
contained the appropriate four-fold symmetry to match that
of the four-dipolar-groups receptor, indicating that such a
ligand would interact equally with all four dipoles. However,
for the alpha-helix, which presents a coil of dipolar groups
receding in the z-direction, it appears that the optimal ligand
computed in this manner would interact strongly only with
the closest dipolar group. It is also interesting to note that the
Coulombic potential due to the optimized multipole distribution
calculated in this way is not a simple reflection of the
Coulombic potential for the isolated receptor. Compare, for
instance, the Coulombic potentials due to the optimized
ligand (Fig. 5a) and due to the receptor (Fig. 5b) for the
four-dipolar-groups problem, both computed in the z = 16.0
plane. The peaks in the ligand potential are "inside" those
of the receptor potential. This may turn out to be a general
feature of electrostatically optimized binding interactions,
which are fundamentally asymmetrical, since one distribution
is fixed while the other is optimized.

FIG. 2. Convergence of $\Delta G^{opt}_{var}$ as a function of the value of $l_{max}$ used in the
calculation. A constant value of $l_{cut} = 40$ was used throughout. Optimizations
in which the total charge on the ligand was free are plotted with
(□), fixed at 0 with (X), and fixed at 1 with (0) for (a) the four dipolar
groups and (b) the alpha-helix.

FIG. 3. Convergence of the magnitude of the lowest seven $2^{l}$-poles for the
optimal ligand as a function of the value of $l_{max}$ used in the calculation for
(a) the four dipolar groups and (b) the alpha-helix. The optimizations were
performed with no constraint on the total ligand charge. In (a) note that the
l = 4 and l = 5 lines fall nearly on top of one another.

IV. DISCUSSION AND CONCLUSION 

Analytic solutions to the Poisson equation have been
used to define the multipole distribution of the ligand that
produces a minimum for the free energy of binding a spherical
ligand to an invariant receptor to form a spherical complex.
An algorithm has been developed and implemented using
numerical computation to evaluate the analytic theory,
and results have been presented for two test cases. In all
solutions examined to date, second-derivative analysis has
verified that the stationary point is a minimum. In this sense,
the multipole distribution is said to be an optimum. An important
feature of the theory presented is that, by expressing
the optimum as a multipole distribution, it can be solved for
directly, without resorting to stochastic searches or other
non-deterministic methods of optimization. This characterization
of the multipole properties of the optimal charge distribution
for a given spherical ligand shape and binding geometry
may be useful in understanding complementary
interactions in molecular binding and recognition. Such
properties may prove particularly applicable to the field of
ligand design either by facilitating the construction of individual
tight-binding ligands or by providing descriptors that
can be used to search compound libraries or aid in the design
of combinatorial libraries.

FIG. 4. Convergence of the Coulombic potential due to the optimal ligand,
plotted along the line (y = -1.1, z = 16.0) for a range of values of $l_{max}$, for
(a) the four dipolar groups and (b) the alpha-helix. The optimizations were
performed with no constraint on the total ligand charge. Note that the curves
for $l_{max}$ = 20 and 40 are nearly identical.
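
The direct, deterministic solution can be sketched for a small made-up problem. This is an illustration under stated assumptions, not the authors' program: the 3 × 3 matrix `B` and vector `A` are invented numbers (with `B` symmetric and positive definite), and plain Gaussian elimination with partial pivoting stands in for the LU decomposition routines mentioned in the Implementation section. The sketch solves $B\vec{Q}^{opt} = -\vec{A}/2$, Eq. (59), and evaluates the quadratic form of Eq. (57).

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def quadratic_term(B, Q):
    """Q^T B Q, the desolvation-like term of Eq. (57)."""
    n = len(Q)
    return sum(Q[i] * B[i][j] * Q[j] for i in range(n) for j in range(n))


def linear_term(A, Q):
    """Q^T A, the interaction-like term of Eq. (57)."""
    return sum(a * q for a, q in zip(A, Q))


def variational_energy(B, A, Q):
    return quadratic_term(B, Q) + linear_term(A, Q)


def optimal_multipoles(B, A):
    """Q_opt = -(1/2) B^{-1} A, obtained by solving B Q_opt = -A/2."""
    return solve_linear(B, [-0.5 * a for a in A])


# Invented example data (B symmetric, diagonally dominant, hence positive definite)
B = [[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]]
A = [-1.0, 0.4, -0.2]
Q_opt = optimal_multipoles(B, A)
print(Q_opt, variational_energy(B, A, Q_opt))
```

For any positive definite `B`, the energy at `Q_opt` is negative, the linear term is minus twice the quadratic term there, and any other trial vector `Q` lies higher by $(\vec{Q}-\vec{Q}^{opt})^{T} B (\vec{Q}-\vec{Q}^{opt})$, as in Eq. (62).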

The observation that an optimum can be defined within
the continuum model presented here so as to provide the
greatest excess of favorable interactions between ligand and
receptor over unfavorable ligand desolvation energy suggests
that the successful design of a tight-binding ligand may involve
substantially more than the construction of a
complementary-shaped molecule that provides compensating
interactions for polar and charged groups in the receptor
binding site. For example, the electrostatics of compensating
a neutral, polar carbonyl group in a receptor with a neutral,
polar hydroxyl may be substantially different than complementing
it with a positively charged ammonium group.
Moreover, due to the effects of longer-range electrostatic
interactions, merely discussing the problem in terms of individually
compensating pairs of groups may be inappropriate,
since each group affects the overall multipole moments of
the ligand. To help answer these questions, we are currently
studying algorithms for designing sets of point charges, as
well as molecules, that have multipole moments corresponding
closely to the optimum defined by this algorithm.

FIG. 5. Contour plot of the Coulombic potential in the z = 16.0 plane for (a) the optimum ligand for the four dipolar groups, (b) the four dipolar groups
themselves, (c) the optimum ligand for the alpha-helix, and (d) the alpha-helix itself. The optimizations were performed with no constraint on the total ligand
charge. Each plot consists of equally spaced contour levels. Each label marks the closest contour level and is valid to three decimal places (i.e., 0.32 in (a) is
0.320 and -0.8 in (b) is -0.800), except -1.1 in (d), which is -1.090 but was rounded for clarity in the figure.

The properties of the optimal multipole distribution and
binding energy are worthy of further study. Here we note
that $\Delta G_{\mathrm{var}}^{\mathrm{opt}}$ is always negative (favorable), but that the overall
binding free energy, $\Delta G_{\mathrm{binding}}$, may or may not be positive
(unfavorable) due to the receptor desolvation energy [11].
Moreover, it is straightforward to prove that the magnitude
of the screened ligand-receptor interaction free energy is
twice that of the ligand desolvation energy at the
optimum ($\Delta G_{\mathrm{int},L\text{-}R}^{\mathrm{opt}} = -2\,\Delta G_{\mathrm{hyd},L}^{\mathrm{opt}}$, so $\Delta G_{\mathrm{var}}^{\mathrm{opt}} = -\Delta G_{\mathrm{hyd},L}^{\mathrm{opt}} = \tfrac{1}{2}\,\Delta G_{\mathrm{int},L\text{-}R}^{\mathrm{opt}}$), and that the same relationship holds for the
contribution of each multipole component, $Q_{l,m}^{\mathrm{opt}}$ [12]. Finally,
the relationship between the Coulombic potential of the optimized
charge distribution and that of the receptor reveals
non-trivial features that reflect subtleties of how best to
achieve favorable interactions in the bound state relative to
ligand desolvation. For the example involving four dipolar
groups, this suggests that chemical groups compensating
each dipole should lie closer to the azimuthal axis than the
corresponding receptor dipole.

The current theory provides a useful starting point for
further studies. We are presently investigating extensions to
solve the linearized [13] and the non-linear Poisson-Boltzmann
equation, which would allow ionic-strength effects of the
aqueous medium to be included. Moreover, it may be possible
to release the restrictions that both the unbound ligand
and the bound complex have spherical geometry, that the
charge distribution of the ligand be the same in the bound
and unbound states, and that titratable groups be treated in a
fixed protonation state. It should be noted that there is no
restriction in the current theory on the shape or charge distribution
of the unbound receptor, since its contribution is a
constant that has been eliminated in the definition of
$\Delta G_{\mathrm{var}}$.
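
The factor-of-two relation between the interaction and desolvation terms at the optimum, stated above as straightforward to prove, can be sketched as follows. This is an illustrative derivation under an assumption, not a reproduction of the authors' proof: it supposes that the ligand desolvation penalty is a positive-definite quadratic form in the multipole vector $Q$ and that the screened ligand-receptor interaction is linear in $Q$. The symbols $L$ (desolvation matrix) and $R$ (interaction vector) are introduced here only for illustration.

```latex
\begin{align}
\Delta G_{\mathrm{var}}(Q)
  &= \underbrace{Q^{T} L\, Q}_{\Delta G_{\mathrm{hyd},L}}
   + \underbrace{R^{T} Q}_{\Delta G_{\mathrm{int},L\text{-}R}},
   \qquad L = L^{T},\ L \succ 0, \\
\nabla_{Q}\,\Delta G_{\mathrm{var}} = 2 L\, Q^{\mathrm{opt}} + R = 0
  &\;\Longrightarrow\;
   Q^{\mathrm{opt}} = -\tfrac{1}{2} L^{-1} R, \\
\Delta G_{\mathrm{hyd},L}^{\mathrm{opt}}
  &= (Q^{\mathrm{opt}})^{T} L\, Q^{\mathrm{opt}}
   = \tfrac{1}{4} R^{T} L^{-1} R, \\
\Delta G_{\mathrm{int},L\text{-}R}^{\mathrm{opt}}
  &= R^{T} Q^{\mathrm{opt}}
   = -\tfrac{1}{2} R^{T} L^{-1} R
   = -2\,\Delta G_{\mathrm{hyd},L}^{\mathrm{opt}}, \\
\Delta G_{\mathrm{var}}^{\mathrm{opt}}
  &= \Delta G_{\mathrm{hyd},L}^{\mathrm{opt}}
   + \Delta G_{\mathrm{int},L\text{-}R}^{\mathrm{opt}}
   = -\Delta G_{\mathrm{hyd},L}^{\mathrm{opt}}
   = \tfrac{1}{2}\,\Delta G_{\mathrm{int},L\text{-}R}^{\mathrm{opt}} \le 0 .
\end{align}
```

Under the same assumption, the Hessian $2L$ is positive definite, so the stationary point is a minimum and the optimal variational energy is never positive.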
Claims

1. A method of modulating the antigen-binding affinity of an antibody comprising,

determining a spatial representation of an optimal charge distribution of the amino
acids of the antibody and associated change in binding free energy of the antibody when
bound to an antigen in a solvent;

identifying at least one candidate amino acid residue position of the antibody to be
modified to alter the binding free energy of the antibody when bound to the antigen; and

selecting an elected amino acid residue for substitution for said amino acid
position,

such that upon substitution, the antigen-binding affinity of the antibody is
modulated.

2. The method of claim 1, further comprising substituting the elected amino acid residue
at the candidate amino acid residue position.

3. A method of modulating the antigen-binding affinity of an antibody comprising,

determining a spatial representation of an optimal charge distribution of the amino
acids of the antibody and associated change in binding free energy of the antibody when
bound to an antigen in a solvent;

identifying at least one candidate amino acid residue position of the antibody to be
modified to alter the binding free energy of the antibody when bound to the antigen;

selecting an alteration for said amino acid position,

such that upon alteration, the antigen-binding affinity of the antibody is
modulated.

4. The method of claim 3, wherein the alteration is selected from the group consisting of
a deletion, an insertion, and an alteration of side chain chemistry.

5. The method of claim 1 or 3, further comprising calculating the change in the free
energy of binding of the antibody containing the modified amino acid or alteration when
bound to the antigen, as compared to the unmodified antibody when bound to the antigen.

6. The method of claim 5, wherein the calculating step first comprises modeling the
modification or alteration of the antibody in silico, and then calculating the change in free
energy of binding.

7. The method of claim 6, wherein the calculating step uses at least one determination
selected from the group consisting of a determination of the electrostatic binding energy
using a method based on the Poisson-Boltzmann equation, a determination of the van der
Waals binding energy, and a determination of the binding energy using a method
based on solvent accessible surface area.

8. The method of claim 1 or 3, further comprising expressing the modified or altered 
antibody. 

9. The method of claim 1 or 3, wherein the modulation is selected from the group 
consisting of an increase in antibody/antigen binding affinity and a decrease in 
antibody/antigen binding affinity. 

10. The method of claim 1, wherein the elected amino acid is from a subset of amino 
acids having characteristic side chain chemistry, said subset of amino acids selected from 
the group consisting of uncharged polar amino acid residues, nonpolar amino acid 
residues, positively charged amino acid residues, and negatively charged amino acid 
residues. 

11. The method of claim 1, wherein the elected amino acid residue increases the free 
energy of binding between antibody and antigen when bound in a solvent, thereby 
decreasing antibody-antigen binding affinity. 

12. The method of claim 1, wherein the elected amino acid residue decreases the free 
energy of binding between antibody and antigen when bound in a solvent, thereby 
increasing antibody-antigen binding affinity. 

13. A method of modulating the antigen-binding affinity of an antibody comprising,

determining a spatial representation of an optimal charge distribution of the amino
acids of the antibody and associated change in binding free energy of the antibody when
bound to an antigen in a solvent,

identifying at least one candidate amino acid residue position of the antibody to be
modified to alter the binding free energy of the antibody when bound to the antigen;

selecting an elected amino acid residue for substitution at said amino acid
position;

modeling the elected amino acid residue for substitution in silico, calculating the
change in free energy of binding of the modified antibody when bound to the antigen; and

substituting the elected amino acid residue for the candidate amino acid residue
position such that the antigen-binding affinity of the antibody is modulated.

14. The method of claim 13, wherein the calculating step uses at least one determination
selected from the group consisting of a determination of the electrostatic binding energy
using a method based on the Poisson-Boltzmann equation, a determination of the van der
Waals binding energy, and a determination of the binding energy using a method
based on solvent accessible surface area.

15. The method of claim 13, further comprising expressing the modified antibody. 

16. The method of any one of claims 1, 3, or 13, wherein the method is repeated at
least one time.

17. The method of any one of claims 1 or 3, wherein the method is conducted in silico.

18. The method of any one of claims 1, 3, or 13, wherein at least one step is informed by
three-dimensional structural data.

19. The method of any one of claims 1, 3, or 13, wherein at least one step is informed by
data selected from the group consisting of binding data derived from an expressed
antibody binding to an antigen in a solvent, crystal structure data of an antibody, crystal
structure data of an antibody bound to an antigen, three-dimensional structural data of an
antibody, NMR structural data of an antibody, and computer-modeled structural data of an
antibody.

20. The method of any one of claims 8 or 15, wherein expressing the modified antibody
is in an expression system selected from the group consisting of an acellular extract
expression system, a phage display expression system, a prokaryotic cell expression
system, and a eukaryotic cell expression system.

21. The method of claim 1, wherein the antibody, or antigen-binding fragment thereof, is
modified at one or more positions within a CDR region(s) selected from the group
consisting of VH CDR1, VH CDR2, VH CDR3, VL CDR1, VL CDR2, and VL CDR3.

22. The method of claim 1, wherein the antibody, or antigen-binding fragment thereof, is
selected from the group consisting of an antibody, an antibody light chain (VL), an
antibody heavy chain (VH), a single chain antibody (scFv), a F(ab')2 fragment, a Fab
fragment, an Fd fragment, and a single domain fragment.

23. The method of claim 1, wherein the antigen-binding affinity of the antibody is
predicted to be increased by a factor of about 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3,
5, 8, 10, 50, 10^2, 10^3, 10^4, 10^5, or 10^6, 10^7, or 10^8.

24. The method of claim 1, wherein the antigen-binding affinity of the antibody is
predicted to be decreased by a factor of about 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3,
5, 8, 10, 50, 10^2, 10^3, 10^4, 10^5, or 10^6, 10^7, or 10^8.



25. The method of claim 1, wherein the antigen-binding affinity is determined in the 
presence of an aqueous solvent containing salt. 

26. The method of claim 25, wherein the solvent comprises physiological concentrations 
of salt. 

27. An antibody, or antigen-binding fragment thereof, produced by the method of any one 
of claims 1, 3 or 13. 

28. An antibody, or antigen-binding fragment thereof, affinity matured according to the 
method of any one of claims 1, 3 or 13. 

29. A plurality of antibodies, or antigen-binding fragments thereof, produced by the 
method of any one of claims 1, 3 or 13. 

30. A nucleic acid encoding the antibody, or antigen-binding fragment thereof, of claim 
27. 

31. A host cell encoding the nucleic acid of claim 30. 

32. An antibody, or binding fragment thereof, produced by culturing the host cell of 
claim 31 under conditions such that antibody, or binding fragment thereof, is expressed. 

33. A pharmaceutical composition comprising the antibody, or antigen-binding fragment 
thereof, of claim 27. 

34. A method for treating or preventing a human disorder or disease comprising, 

administering a therapeutically-effective amount of the pharmaceutical 
composition of claim 33, such that therapy or prevention of the human disease or disorder 
is achieved.

35. The method of any one of claims 1, 3 or 13, wherein one or more steps is
computer-assisted.

36. A medium suitable for use in an electronic device having instructions for carrying out
one or more steps of the method of any one of claims 1, 3 or 13.

37. A device for carrying out one or more steps of the method of any one of claims 1, 3 
or 13. 
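
Claims 11, 12, 23 and 24 relate a change in the free energy of binding to a change in affinity by some factor. As a minimal illustrative sketch, and not part of the claimed method, the two quantities can be interconverted under the standard thermodynamic relation between binding free energy and the dissociation constant. The temperature of 298.15 K, the kcal/mol units, and the function names below are assumptions introduced here for illustration and are not specified in the claims.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def affinity_fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Fold change in affinity (old K_d / new K_d) for a change ddG in the
    binding free energy, assuming dG = R*T*ln(K_d).

    A negative ddG (more favorable binding) gives a factor above 1;
    a positive ddG gives a factor below 1.
    """
    return math.exp(-ddg_kcal_per_mol / (R_KCAL * temperature_k))


def ddg_for_fold_change(fold, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) for a given affinity factor."""
    return -R_KCAL * temperature_k * math.log(fold)
```

Under these assumptions, a tenfold increase in affinity corresponds to a decrease in binding free energy of roughly 1.4 kcal/mol at room temperature, so the factors listed in claims 23 and 24 span changes from a fraction of a kcal/mol up to about 11 kcal/mol.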



WO 2005/011376    PCT/US2004/024200

1/2

Fig. 1 [drawing; labels: antigen, antibody]

2/2

Fig. 2 [sequence drawing; content not recoverable from the text extraction]